Abstract. We use a weak Gibbs property and a weak form of
specification to derive level-2 large deviations principles for
symbolic systems equipped with a large class of reference measures.
This has applications to a broad class of coded systems, including
$\beta$-shifts, $S$-gap shifts, and their factors. Our techniques are
suitable for adaptation beyond the symbolic setting.
1. Introduction



We introduce criteria for a symbolic system to satisfy the large
deviations principle. These criteria are motivated by the 'non-uniform'
structure of our main examples ($\beta$-shifts, $S$-gap shifts, and their
factors) but apply more generally. We prove the following main result.
(See §2 for precise definitions.)

Theorem A. Let $(X, \sigma)$ be a shift on a finite alphabet, $m$ a Borel
probability measure on $X$, and $\varphi\colon X \to \mathbb{R}$ a continuous function. Let
$\mathcal{L}$ be the language of $X$. Suppose there exists a set $\mathcal{G} \subset \mathcal{L}$ such that

[A.1] $\mathcal{G}$ has (W)-specification with good concatenations;

[A.2] $\mathcal{L}$ is edit approachable by $\mathcal{G}$;

[A.3] $m$ is Gibbs for $\varphi$ with respect to the collection $\mathcal{G}$.

Then $(X, \sigma)$ satisfies a level-2 large deviations principle with reference
measure $m$ and rate function $q^\varphi\colon \mathcal{M}(X) \to [-\infty, 0]$ given by

(1.1) $q^\varphi(\mu) = \begin{cases} h(\mu) + \int \varphi\,d\mu - P(\varphi) & \mu \in \mathcal{M}_\sigma(X), \\ -\infty & \text{otherwise.} \end{cases}$



Roughly speaking, [A.1] means that words from $\mathcal{G}$ can be 'glued
together' with uniformly bounded gaps to obtain another word in $\mathcal{G}$.
The condition [A.2] means that any word $w \in \mathcal{L}$ can be transformed
into a word in $\mathcal{G}$ without making too many edits. Finally, [A.3] means
that $m$ satisfies an upper Gibbs bound on all cylinders, and a lower
Gibbs bound on cylinders corresponding to words in $\mathcal{G}$.

Context of Theorem A. Large deviations theory originated in
probability theory and statistical mechanics, with a long history dating to
the 1938 theorem of Cramér (which applies to i.i.d. random variables)
and ultimately to a heuristic 1877 observation of Boltzmann [151 fTB] . 

In the context of dynamical systems, large deviations results have 
received a great deal of attention since their introduction in the 1980's 
[301 1371 [381 [m |22l m]. The results we study quantify the rate of 
convergence of empirical averages relative to a fixed reference measure 
$m$. That is, one studies the rate of decay of $m\{x \mid \mathcal{E}_n(x) \in U\}$, where
$\mathcal{E}_n(x)$ is the empirical measure of order $n$ associated to the point $x$, and
$U$ is a suitable subset of the space of all probability measures on $X$.
One may also consider level-1 large deviations principles, which study
$m\{x \mid \frac{1}{n} S_n\varphi(x) \in V\}$ for some fixed observable $\varphi$ and some $V \subset \mathbb{R}$.
Level-1 results for continuous observables follow from level-2 results via
the contraction principle (see [TO]). 

To place our results in context, we summarize the main approaches 
to large deviations in dynamical systems. We recommend the intro- 
ductions of [m [271 ESI Uni [5] for further references and discussion. The 
Large Deviation Principle can be divided into two steps: 

(1) Upper Bound: bound $\limsup_{n\to\infty} \frac{1}{n} \log m\{x \mid \mathcal{E}_n(x) \in U\}$ from above
in terms of a rate function. For an optimal result, we want to
give the best possible description of the rate function.

(2) Lower Bound: bound $\liminf_{n\to\infty} \frac{1}{n} \log m\{x \mid \mathcal{E}_n(x) \in U\}$ from below,
ideally in terms of the same rate function as the upper bound.

It is often the case that the upper bound is easier to establish than
the lower bound, requiring only an upper Gibbs bound on $m$ (see §4.1).
The task of obtaining a lower bound is generally carried out using one 
of the following three methods: 

(1) The 'orbit-gluing approach' relies on direct constructions based 
on the specification property (or one of its variants). This is 
the approach taken in this paper. 

(2) The 'functional approach' relies on differentiability of a certain 
functional on the space of observables, which can be related to 
uniqueness of equilibrium states. 

(3) The 'tower approach' relies on relating the original system to 
a countable state Markov shift via a tower construction. The 
focus here is typically on the case where the reference measure
is either volume or an SRB measure. 

The first two approaches (orbit-gluing and functional) are funda- 
mentally related to the theory of thermodynamic formalism, and have 
their full power in settings where this theory is well understood, such 
as uniformly hyperbolic systems. These approaches yield exponential 
rates of decay when they apply. The tower approach has been used 
to deal with a broad class of non-uniformly hyperbolic systems, where 
complete thermodynamic results are not always available and the rate 
of decay may be either exponential or polynomial. 

We note that Araújo and Pacifico [1] have large deviations results
based on the notion of hyperbolic times. These results apply to certain 
non-uniform and partially hyperbolic systems and do not fit into the
classification above. We now recall some of the key results obtained in 
the literature using the three approaches. 

The tower approach. The main advantages of the tower approach are 
its flexibility and the fact that it is the only method which yields results 
on both exponential and sub-exponential rates of decay. Quoting Rey- 
Bellet and Young [35], "The tower construction enables one to treat - 
in a unified way - a larger class of dynamical systems without insisting 
on optimal results." The tower results of which we are aware focus on 
the case when the reference measure is either SRB or Lebesgue. 

The tower approach dates back at least to Keller and Nowicki [2T], 
who studied large deviations for Collet-Eckmann unimodal maps by 
using quasi-compactness of the transfer operator associated to a suit- 
able tower. For Manneville-Pomeau maps, the level-1 large deviation 
principle for Hölder continuous observables was obtained by Pollicott,
Sharp, and Yuri [31], in the cases where exponential decay rates ap- 
ply, and level-2 results for both polynomial and exponential decay were 
later obtained by Pollicott and Sharp [5B] . 

For a broader class of non-uniform examples, level-1 results for Hölder
continuous observables have been obtained using the towers introduced 
by Young in [l5]. Results for exponential decay were given by Rey- 
Bellet and Young in [35] and by Melbourne and Nicol in [27], using 
results on quasi-compact operators [19] to obtain local differentiabil- 
ity of the logarithmic moment generating function, whose Legendre 
transform gives the rate function. Using martingales, Melbourne and
Nicol [27] also gave the first results for polynomial decay, which were
improved by Melbourne in [26].

The results mentioned so far all have a 'local' aspect, which means
that they describe the rate of decay of $m\{x \mid \frac{1}{n} S_n\varphi(x) \in V\}$ only when
the set $V$ is the complement of a sufficiently small neighbourhood of
the expected value (the $m$-a.e. limit of $\frac{1}{n} S_n\varphi$). 'Global' large deviations
results, such as the ones we prove in this paper, do not hold without
further conditions on the tower (see [U §5] for examples). For towers 
satisfying an additional 'nonsteep' condition, Chung [4] has obtained 
a full level-2 large deviations principle, successfully removing both the 
'local' assumption and the regularity assumption on the observables. 
He describes the rate function using Lyapunov exponents. These results 
have been applied to quadratic maps by Chung and Takahasi [3]. 

The functional approach. The technical and historical aspects of the 
functional approach are explained in detail in the introduction of [10].
This approach is similar in spirit to the original Gärtner-Ellis the-
orem [m [H] from probability theory, and was first implemented in 
dynamical systems by Takahashi [37, 38]. A general formulation of the
functional approach was given by Kifer [22]. The upper bound requires 
a weak version of the Gibbs property (such as [10, (1.5)]), and the key
requirement for the lower bound is the existence of a dense subspace
$W \subset C(X)$ such that every $\varphi \in W$ has a unique equilibrium state.

The functional approach has been successfully applied to rational 
maps of the Riemann sphere. For hyperbolic rational maps, with the 
measure of maximal entropy as the reference measure, this was carried 
out by Lopes [25]. For the broader class of topological Collet-Eckmann
rational maps, Comman and Rivera-Letelier [10] showed that every
Hölder continuous potential $\varphi$ has a unique equilibrium state $\mu_\varphi$, and
the system satisfies level-2 large deviations with reference measure $\mu_\varphi$.

In the more abstract setting of this paper, so far there are no known 
axiomatic conditions to verify the dense subspace condition. This
condition is known to hold for $\beta$-shifts [SI 112] and also follows for $S$-gap
shifts from the arguments in §5 of this paper, but so far the proof
in each case is specific to that particular example. It is a problem 
of independent interest to identify dynamical systems for which the 
dense subspace condition holds, although it may be challenging (or 
even impossible) to verify in some situations where we expect the large 
deviations results of this paper to apply. 

The orbit-gluing approach. The idea of this method is that if we have 
the ability to 'glue' a collection of finite orbit segments into a single 
orbit segment, subject to a controlled amount of error, then we can 
obtain lower estimates via a constructive proof. The property which 
allows us to glue is called the specification property, and there are 
many variants on its precise definition, many of which are surveyed 
in [13]. The orbit-gluing approach to large deviations dates back at 
least to work of Föllmer and Orey [17], who considered full $\mathbb{Z}^d$-shifts.
The presence of phase transitions for $d \ge 2$ renders the orbit-gluing
approach preferable to the functional approach in this setting. Results
for full $\mathbb{Z}^d$-shifts were also obtained by Eizenberg, Kifer, and Weiss [13].

The benchmark result in the orbit-gluing approach is that if a topo- 
logical dynamical system $(X, f)$ satisfies the specification property, and
an invariant measure $m$ on $X$ satisfies the Gibbs property, then $(X, f)$
satisfies the large deviations principle with reference measure $m$. A
level-1 result in this form was given by Young [HI Theorem 1]. 

The specification property holds for uniformly hyperbolic dynamical 
systems, including topologically mixing Anosov diffeomorphisms and 
subshifts of finite type. In this setting, the Gibbs property can be 
verified for equilibrium measures corresponding to Hölder continuous
potentials. The specification property and the Gibbs property are uni- 
form global assumptions, and thus quite restrictive: in particular, they 
fail to hold for generic non-uniformly hyperbolic systems [28] . 

To apply the orbit-gluing method beyond this well understood set- 
ting, the challenge is to find appropriate conditions to replace the uni- 
form ones. A key breakthrough in this direction is the work of Pfister
and Sullivan [21], who used a weakened form of the specification
property to prove that all $\beta$-shifts verify the large deviations principle (using
the unique measure of maximal entropy as the reference measure). Us-
ing a similar approach, the third named author [13] used a suitable 
weak specification property to prove that ergodic automorphisms with 
positive topological entropy satisfy large deviations with Haar measure 
as the reference measure. The work of Varandas takes a similar phi- 
losophy (although using very different weak specification and Gibbs 
properties) in a smooth setting [H]. As far as we know, these are the 
only large deviations results which have been derived from weakened 
specification properties. 

Results in this paper. The primary goal of this paper is to establish 
conditions under which a class of dynamical systems with non-uniform 
structure can be treated using the orbit-gluing approach. A significant 
difficulty in applying either the functional approach or the orbit-gluing 
approach in the non-uniform setting is to obtain the necessary thermo- 
dynamic results, in particular existence and uniqueness of equilibrium 
states and some version of the Gibbs property. The recent work of
[7, 8] has provided the necessary thermodynamic groundwork to make
this approach tractable for a large class of symbolic systems, by giving
conditions for a potential $\varphi$ to have a unique equilibrium state with
the weak Gibbs property [A.3]; we state these in Theorem 3.1.
Our main result is that we can obtain a full level-2 large deviations
principle from non-uniform versions of the Gibbs property and the
specification property similar to those introduced in [7, 8].

We work in the symbolic setting because our motivating examples are 
symbolic and because the exposition and computations are simpler in 
this context. Our approach is suitable for adaptation to non-symbolic 
topological dynamical systems, where analogues of some of the ideas 
in this paper have already been introduced 0. 

The criteria we introduce can be verified for many shift spaces in
the class of coded systems, including $S$-gap shifts, $\beta$-shifts, and their
factors. As shown in [7], there are $S$-gap shifts that do not satisfy
any of the weak specification properties discussed in [13], including the
almost specification property of [3T| HO]. Thus, none of the previous
large deviations results based on the 'orbit-gluing' approach can be
applied to arbitrary $S$-gap shifts, so this is a completely new result for
this family of shift spaces (we expect that existing tower methods would
yield a 'local' level-1 result for arbitrary $S$-gap shifts, but this has never
been carried out). For $\beta$-shifts, our result is a generalisation of pTj,
and the novel application here is that we are able to use a large class
of reference measures, instead of just the measure of maximal entropy.
For subshift factors of $\beta$-shifts and $S$-gap shifts, our result is completely
new. Furthermore, the main results are formulated axiomatically; thus 
one may construct many more examples of shift spaces to which our 
results apply, and we expect that both the symbolic result and future 
generalisations to the non-symbolic setting will yield more applications 
in the future. 



Layout of the paper. In §2 we establish our definitions. In §3 we
give various results that follow from Theorem A by using the
thermodynamic results developed in [7, 8]. In particular, we give applications
to $\beta$-shifts, $S$-gap shifts, and their factors.

In §4 we prove Theorem A. The proofs of lemmas which are not
proved in the body of the text appear in a separate section. In §5 we give proofs
that the examples ($\beta$-shifts and $S$-gap shifts) satisfy the conditions
of Theorem A. To apply Theorem A to every equilibrium state for
a Bowen potential on an $S$-gap shift, we establish an intermediate
result of independent interest which states that these equilibrium states
always have positive entropy.

In an appendix, we fill in the details of the proof that (S)-specification
can be replaced by (W)-specification in the main result of [8] (quoted
here as Theorem 3.1), which is important for our examples.
2. Definitions

2.1. Large deviations principles. Let $(X, d)$ be a compact metric
space and $f\colon X \to X$ a continuous map. Denote by $\mathcal{M}(X)$ the set
of all Borel probability measures on $X$ with the weak* topology. This
topology is induced by the metric

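$D(\mu, \nu) = \sum_{n=1}^{\infty} 2^{-n} \frac{\left| \int \varphi_n\,d\mu - \int \varphi_n\,d\nu \right|}{\|\varphi_n\|}$,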

where $\{\varphi_n\} \subset C(X)$ is a countable dense subset. Let $\mathcal{M}_f(X) \subset \mathcal{M}(X)$
be the set of $f$-invariant Borel probability measures, and let $\mathcal{M}^e_f(X) \subset
\mathcal{M}_f(X)$ be the set of ergodic measures. Given $x \in X$, consider the
empirical measures

$\mathcal{E}_n(x) = \frac{1}{n} \sum_{j=0}^{n-1} \delta_{f^j x}$,

where $\delta_y$ is the point measure concentrated at $y$. For a fixed $x$, the
sequence of empirical measures $\mathcal{E}_n(x)$ converges to $\mathcal{M}_f(X)$, in the sense
that every limit point of the sequence is $f$-invariant. Large
deviations theory quantifies some aspects of that convergence.
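As a concrete illustration of the empirical measures defined above, the following Python sketch records how much mass $\mathcal{E}_n(x)$ gives to each cylinder of a fixed length $k$ when $x$ is a symbolic sequence and $f$ is the shift. The function name `empirical_measure`, the parameter `k`, and the sample sequence are assumptions made for illustration; none of them come from the paper.

```python
from collections import Counter
from fractions import Fraction


def empirical_measure(x, n, k):
    """Mass that E_n(x) = (1/n) * sum_{j<n} delta_{sigma^j x} gives to each cylinder [w], |w| = k.

    x is a finite prefix of a one-sided sequence; it must contain at least n + k - 1 symbols
    so that every shifted window of length k is fully visible.
    """
    if len(x) < n + k - 1:
        raise ValueError("need at least n + k - 1 symbols of x")
    # The j-th shift of x lies in the cylinder [w] exactly when x[j:j+k] == w.
    counts = Counter(tuple(x[j:j + k]) for j in range(n))
    return {w: Fraction(c, n) for w, c in counts.items()}


# Hypothetical sample data: a prefix of the periodic point (0 0 1)^infinity.
x = [0, 0, 1] * 4
mu = empirical_measure(x, n=9, k=2)
```

For this periodic sample the three words 00, 01 and 10 each receive mass 1/3, which matches the invariant measure supported on the periodic orbit.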

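Definition 2.1. We say that the system $(X, f)$ satisfies a level-2 large
deviations principle with a reference measure $m \in \mathcal{M}(X)$ and a rate
function $q\colon \mathcal{M}(X) \to [-\infty, 0]$ if $q$ is upper semicontinuous,

$\liminf_{n\to\infty} \frac{1}{n} \log m(\{x \in X \mid \mathcal{E}_n(x) \in U\}) \ge \sup_{\mu \in U} q(\mu)$

holds for any open set $U \subset \mathcal{M}(X)$, and

$\limsup_{n\to\infty} \frac{1}{n} \log m(\{x \in X \mid \mathcal{E}_n(x) \in F\}) \le \sup_{\mu \in F} q(\mu)$

holds for any closed set $F \subset \mathcal{M}(X)$.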

2.2. Shift spaces and languages. Let $A$ be a finite set and $A^{\mathbb{N}}$ (resp.
$A^{\mathbb{Z}}$) be the set of all one-sided (resp. two-sided) infinite sequences on the
alphabet $A$, endowed with the standard metric $d(x, y) = 2^{-t(x,y)}$, where
$t(x,y) = \min\{|k| \mid x_k \ne y_k\}$. The shift map on $A^{\mathbb{N}}$ is $\sigma\colon x_1 x_2 \cdots \mapsto
x_2 x_3 \cdots$, and the shift map on $A^{\mathbb{Z}}$ is defined analogously. A subshift
is a closed $\sigma$-invariant set $X \subset A^{\mathbb{N}}$ or $X \subset A^{\mathbb{Z}}$. All of the results and
proofs in this paper apply equally to one-sided and two-sided shifts, so
we treat both cases simultaneously.

The language of $X$, denoted by $\mathcal{L} = \mathcal{L}(X)$, is the set of all finite
words that appear in any sequence $x \in X$; that is,

$\mathcal{L}(X) = \{w \in A^* \mid [w] \ne \emptyset\}$,

where $A^* = \bigcup_{n \ge 0} A^n$ and $[w]$ is the central cylinder for $w$, which in the
one-sided case is the set of sequences $x \in X$ that begin with the word
$w$. Given $w \in \mathcal{L}$, let $|w|$ denote the length of $w$. For any collection
$\mathcal{D} \subset \mathcal{L}$, let $\mathcal{D}_n$ denote $\{w \in \mathcal{D} \mid |w| = n\}$. Thus, $\mathcal{L}_n$ is the set of
all words of length $n$ that appear in sequences belonging to $X$. Given
words $u, v$, we use juxtaposition $uv$ to denote the word obtained by
concatenation.

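A decomposition of $\mathcal{L}$ is a collection of three sets of words $\mathcal{C}^p, \mathcal{G}, \mathcal{C}^s \subset
\mathcal{L}$ such that given any $w \in \mathcal{L}$, there exist $u^p \in \mathcal{C}^p$, $v \in \mathcal{G}$, $u^s \in \mathcal{C}^s$
such that $w = u^p v u^s$. We write $\mathcal{L} = \mathcal{C}^p \mathcal{G} \mathcal{C}^s$ when the language can
be decomposed in this way. We make a standing assumption that the empty word
belongs to each of $\mathcal{C}^p, \mathcal{G}, \mathcal{C}^s$, to allow for words in $\mathcal{L}$ that belong purely to one of the
three collections (this is also implicit in [7, 8]).

Once a decomposition $\mathcal{L} = \mathcal{C}^p \mathcal{G} \mathcal{C}^s$ has been fixed, we consider for
each $M \in \mathbb{N}$ the set

(2.1) $\mathcal{G}^M := \{uvw \in \mathcal{L} \mid v \in \mathcal{G},\ u \in \mathcal{C}^p,\ w \in \mathcal{C}^s,\ |u| \le M,\ |w| \le M\}$.

Note that $\mathcal{L} = \bigcup_{M \in \mathbb{N}} \mathcal{G}^M$, so this defines a filtration of the language of
$X$.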

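2.3. Entropy and pressure for shift spaces. Given a collection
$\mathcal{D} \subset \mathcal{L}$, the entropy of $\mathcal{D}$ is

$h(\mathcal{D}) := \limsup_{n\to\infty} \frac{1}{n} \log \#\mathcal{D}_n$,

where $\mathcal{D}_n = \{w \in \mathcal{D} \mid |w| = n\}$. The entropy of an invariant measure
$\mu \in \mathcal{M}_\sigma(X)$ is

$h(\mu) := \lim_{n\to\infty} \frac{1}{n} \sum_{w \in \mathcal{L}_n} -\mu[w] \log \mu[w]$.

For a fixed potential function $\varphi \in C(X)$, the pressure of $\mathcal{D} \subset \mathcal{L}$ is

$P(\mathcal{D}, \varphi) := \limsup_{n\to\infty} \frac{1}{n} \log \Lambda_n(\mathcal{D}, \varphi)$,

where

$\Lambda_n(\mathcal{D}, \varphi) = \sum_{w \in \mathcal{D}_n} e^{\sup_{x \in [w]} S_n\varphi(x)}$

and $S_n\varphi(x) = \sum_{k=0}^{n-1} \varphi(\sigma^k x)$. We write $P(\varphi) = P(\mathcal{L}, \varphi)$.

We will be primarily concerned with potentials having some extra
regularity: we say that $\varphi$ has the Bowen property on $\mathcal{D}$ if there is $V \in \mathbb{R}$
such that for every $n \in \mathbb{N}$, every $w \in \mathcal{D}_n$, and every $x, y \in [w]$, we have
$|S_n\varphi(x) - S_n\varphi(y)| \le V$. In particular, if $\varphi$ is Hölder continuous then
it has the Bowen property on every $\mathcal{D} \subset \mathcal{L}$.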
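To make the definition of the entropy of a collection of words concrete, here is a small Python sketch that counts words by brute force and evaluates the finite-$n$ quantity $\frac{1}{n} \log \#\mathcal{L}_n$. It uses the golden mean shift (binary sequences with no two consecutive 1s) purely as a sample system; this example, and the helper names `words_of_length` and `entropy_estimate`, are assumptions for illustration and are not taken from the paper. For this particular shift every word avoiding 11 can be extended to an infinite admissible sequence, so avoiding the forbidden word characterises the language.

```python
import math
from itertools import product


def words_of_length(n, forbidden, alphabet="01"):
    """All words of length n over `alphabet` that contain none of the forbidden subwords."""
    words = ("".join(letters) for letters in product(alphabet, repeat=n))
    return [w for w in words if not any(f in w for f in forbidden)]


def entropy_estimate(n, forbidden):
    """Finite-n approximation (1/n) * log #L_n of the entropy of the language."""
    return math.log(len(words_of_length(n, forbidden))) / n


# Hypothetical sample system: the golden mean shift, whose only forbidden word is "11".
estimate = entropy_estimate(12, ["11"])
```

The estimate decreases towards the logarithm of the golden ratio as $n$ grows; brute-force enumeration is exponential in $n$, so this is only suitable for small word lengths.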
2.4. Specification. We define the specification properties that appear
in this paper, and the relationships between them. In [7], we introduced
the following definitions.

Definition 2.2. Given a shift space $X$ and its language $\mathcal{L}$, consider
a subset $\mathcal{G} \subset \mathcal{L}$. Fix $\tau \in \mathbb{N}$; any of the following conditions defines a
specification property on $\mathcal{G}$ with gap size $\tau$.

(W): For all $m \in \mathbb{N}$ and $w^1, \dots, w^m \in \mathcal{G}$, there exist $v^1, \dots, v^{m-1} \in \mathcal{L}$
such that $x := w^1 v^1 w^2 v^2 \cdots v^{m-1} w^m \in \mathcal{L}$ and $|v^i| \le \tau$ for all $i$.

(S): Condition (W) holds, and in addition, the connecting words $v^i$
can all be chosen to have length exactly $\tau$.

In the case $\mathcal{G} = \mathcal{L}$, these definitions are equivalent to the well known
weak specification, respectively specification, property of the shift.

In this paper, we also consider the following specification properties.

Definition 2.3. Given a shift space $X$ and its language $\mathcal{L}$, consider a
subset $\mathcal{G} \subset \mathcal{L}$. We say that $\mathcal{G}$ has (W)-specification with good
concatenations, or simply (gcW)-specification, if there exists $\tau \in \mathbb{N}$ such that
for all $m \in \mathbb{N}$ and $w^1, \dots, w^m \in \mathcal{G}$, there exist $v^1, \dots, v^{m-1} \in \mathcal{L}$ such that

(1) $|v^i| \le \tau$ for all $i$, and

(2) $w^i v^i \cdots v^{j-1} w^j \in \mathcal{G}$ for every $1 \le i < j \le m$.

The meaning of (gcW)-specification is essentially that words in $\mathcal{G}$
can always be glued together to give new words in $\mathcal{G}$.

Definition 2.4. Given a shift space $X$ and its language $\mathcal{L}$, consider a
subset $\mathcal{G} \subset \mathcal{L}$. We say that $\mathcal{G}$ has specification with transition time 0,
or (0)-specification, if for all $u, w \in \mathcal{G}$, we have $uw \in \mathcal{G}$. We may also
refer to (0)-specification as the free concatenation property.

It is clear from the definitions that (0)-specification implies
(gcW)-specification, and that (gcW)-specification implies (W)-specification.
The advantage of (0)-specification is that it is a simple criterion, which
can be verified easily for natural collections $\mathcal{G}$ associated with $\beta$-shifts,
$S$-gap shifts, and more general coded systems.

2.5. Edit Approachability. First we introduce the edit metric
(sometimes known as the Damerau-Levenshtein metric) on $\mathcal{L}$.

Definition 2.5. Define an edit of a word $w = w_1 \cdots w_n \in \mathcal{L}$ to be a
transformation of $w$ by one of the following actions, where $u^j \in \mathcal{L}$ are
arbitrary words and $a, a' \in A$ are arbitrary symbols.

(1) Substitution: $w = u^1 a u^2 \mapsto w' = u^1 a' u^2$.

(2) Insertion: $w = u^1 u^2 \mapsto w' = u^1 a' u^2$.

(3) Deletion: $w = u^1 a u^2 \mapsto w' = u^1 u^2$.

Given $v, w \in \mathcal{L}$, define the edit distance between $v$ and $w$ to be the
minimum number of edits required to transform the word $v$ into the
word $w$: we will denote this by $d(v, w)$.
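The edit distance of Definition 2.5 can be computed by the standard dynamic programme for substitutions, insertions and deletions. The sketch below is a minimal Python version under one stated assumption: it treats every string as admissible and does not check that intermediate words lie in the language $\mathcal{L}$. The function name `edit_distance` is chosen here for illustration.

```python
def edit_distance(v, w):
    """Minimum number of substitutions, insertions and deletions that turn v into w."""
    # prev[j] is the distance between the current prefix of v and the prefix w[:j].
    prev = list(range(len(w) + 1))
    for i, a in enumerate(v, start=1):
        curr = [i]  # turning v[:i] into the empty word takes i deletions
        for j, b in enumerate(w, start=1):
            curr.append(min(
                prev[j] + 1,             # delete the symbol a
                curr[j - 1] + 1,         # insert the symbol b
                prev[j - 1] + (a != b),  # substitute a by b, or keep a if equal
            ))
        prev = curr
    return prev[-1]
```

For example, `edit_distance("0010", "010")` is 1 (a single deletion), whereas the Hamming distance is not even defined for these two words because their lengths differ. Keeping only two rows of the table makes the memory use linear in the length of `w`.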

The following lemma about the size of balls in the edit metric will 
be crucial for our entropy estimates. 

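Lemma 2.6. There is $C > 0$ such that given $n \in \mathbb{N}$, $w \in \mathcal{L}_n$, and
$\delta > 0$, we have

(2.2) $\#\{v \in \mathcal{L} \mid d(v, w) \le \delta n\} \le C n^C \left(e^{C\delta} e^{-\delta \log \delta}\right)^n$.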

Now we can introduce our key new definition, which requires that 
any word in C can be transformed into a word in Q with a relatively 
small number of edits. 

Definition 2.7. Say that a non-decreasing function $g\colon \mathbb{N} \to \mathbb{N}$ is a
mistake function if $\frac{g(n)}{n}$ converges to 0. We say that $\mathcal{L}$ is edit
approachable by $\mathcal{G}$, where $\mathcal{G} \subset \mathcal{L}$, if there is a mistake function $g$ such that for
every $w \in \mathcal{L}$, there exists $v \in \mathcal{G}$ with $d(v, w) \le g(|w|)$.

An important consequence of edit approachability is that we can re- 
place sufficiently long words in C with words in Q in such a way that 
estimates on Birkhoff averages, and thus estimates on empirical mea- 
sures, can be well controlled, while at the same time, (2.2) guarantees
that not much entropy is lost this way. 

Control on the Birkhoff averages is given by the following lemma. 

Lemma 2.8. For any continuous function $\varphi\colon X \to \mathbb{R}$ and any mistake
function $g(n)$, there is a sequence of positive numbers $\delta_n \to 0$ such that
if $x, y \in X$ and $m, n \in \mathbb{N}$ are such that $d(x_1 \cdots x_n, y_1 \cdots y_m) \le g(n)$,
then $\left|\frac{1}{n} S_n\varphi(x) - \frac{1}{m} S_m\varphi(y)\right| \le \delta_n$.

It follows from Lemmas 2.6 and 2.8 that edit approachability implies
that the collection $\mathcal{G}$ carries full pressure for every continuous potential
(this result is included for independent interest although we do not use
it in this paper).

Proposition 2.9. If $\mathcal{L}$ is edit approachable by $\mathcal{G}$, then $P(\mathcal{G}, \varphi) = P(\varphi)$
for every $\varphi \in C(X)$.

Remark 2.10. If, in Definition 2.7, we replace $d(v, w)$ with the
Hamming distance $d_{\mathrm{Ham}}$ between $v$ and $w$, we could say that $\mathcal{L}$ is
Hamming approachable by $\mathcal{G}$. Clearly, if $\mathcal{L}$ is Hamming approachable by
$\mathcal{G}$, then $\mathcal{L}$ is edit approachable by $\mathcal{G}$. If $\mathcal{L}$ is Hamming approachable
by $\mathcal{G}$, and $\mathcal{G}$ satisfies the specification property, then it is easy to see
that the symbolic space satisfies the almost specification property of [7,
§3.3]: there exists a mistake function $g$ so that for any words $u, v \in \mathcal{L}$,
there are words $u', v' \in \mathcal{L}$ so that $u'v' \in \mathcal{L}$, $d_{\mathrm{Ham}}(u, u') \le g(|u|)$ and
$d_{\mathrm{Ham}}(v, v') \le g(|v|)$.

Hamming approachability is too strong an assumption for our
application to $S$-gap shifts, since there are examples of $S$-gap shifts without
the almost specification property [7, §3.3]. When $\mathcal{L}$ is edit
approachable by $\mathcal{G}$, and $\mathcal{G}$ has (W)-specification, we see that the symbolic space
satisfies a weaker version of almost specification, given by replacing
Hamming distance with edit distance in the definition of almost
specification.

2.6. Gibbs properties. The standard Gibbs property for a shift space
says that a measure $m \in \mathcal{M}(X)$ is Gibbs if there are constants $K, K' > 0$
such that

(2.3) $K \le \frac{m[x_1 \cdots x_n]}{e^{-nP(\varphi) + S_n\varphi(x)}} \le K'$

for all $x \in X$ and $n \in \mathbb{N}$. We will require that the upper bound hold uniformly,
while the lower bound will only be required to hold when $x_1 \cdots x_n \in \mathcal{G}$.
More precisely, we make the following definition for a collection $\mathcal{G} \subset \mathcal{L}$.

Definition 2.11. A measure $m \in \mathcal{M}(X)$ is Gibbs with respect to $\mathcal{G}$ if
there are constants $K, K' > 0$ such that

(2.4) $m[x_1 \cdots x_n] \le K' e^{-nP(\varphi) + S_n\varphi(x)}$

for every $x \in X$ and $n \in \mathbb{N}$, and

(2.5) $m[x_1 \cdots x_n] \ge K e^{-nP(\varphi) + S_n\varphi(x)}$

whenever $x \in X$ and $n \in \mathbb{N}$ are such that $x_1 \cdots x_n \in \mathcal{G}$.

Definition 2.11 is property [A.3] of Theorem A. Theorem 3.1
provides examples of measures satisfying this definition.

2.7. Properties under factors. One advantage of our techniques is
that they behave well under factors. We let $\Sigma$ be a shift space, and
we let $\mathcal{G} \subset \mathcal{L}(\Sigma)$. Suppose that $X$ is a topological factor of $\Sigma$, that is,
there exists a continuous surjective map $\pi\colon \Sigma \to X$ such that $\sigma \circ \pi =
\pi \circ \sigma$. By [23, Theorem 6.29], $\pi$ is a block code: there exist $r \in \mathbb{N}$ and
$\psi\colon \mathcal{L}_{2r+1}(\Sigma) \to A$, where $A$ is the alphabet of $X$, such that each
coordinate of $\pi(x)$ is obtained by applying $\psi$ to a window of $2r+1$
consecutive symbols of $x$.

This induces a surjective map $\Psi\colon \mathcal{L}(\Sigma)_{n+2r} \to \mathcal{L}(X)_n$, by

$\Psi(w_1 \cdots w_{n+2r}) = \psi(w_1 \cdots w_{2r+1}) \psi(w_2 \cdots w_{2r+2}) \cdots \psi(w_n \cdots w_{n+2r})$.

We set $\tilde{\mathcal{G}} = \Psi(\mathcal{G})$. The key to our study of $X$ is that $\tilde{\mathcal{G}}$ inherits a
number of good properties of $\mathcal{G}$, including in particular [A.1], [A.2]
and the condition (I) that appears in Theorem 3.1 below.
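The induced map $\Psi$ on words is a sliding-window computation, which the following Python sketch makes explicit. The helper name `induced_word_map` and the sample block map (sum of a window modulo 2 with $r = 1$) are hypothetical choices for illustration; the paper does not specify any particular $\psi$.

```python
def induced_word_map(psi, r):
    """Return Psi on words of length n + 2r, built from a block map psi on windows of length 2r + 1."""
    window = 2 * r + 1

    def Psi(w):
        if len(w) < window:
            raise ValueError("word is shorter than one window")
        # Slide the window one symbol at a time; a word of length n + 2r yields n symbols.
        return tuple(psi(tuple(w[i:i + window])) for i in range(len(w) - window + 1))

    return Psi


# Hypothetical block map with r = 1: the sum of each window of three symbols, modulo 2.
Psi = induced_word_map(lambda block: sum(block) % 2, r=1)
image = Psi((1, 0, 0, 1, 1))
```

A word of length 5 is sent to a word of length 3 here, matching the drop from $n + 2r$ to $n$ symbols in the formula for $\Psi$.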



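Lemma 2.12. Let $\mathcal{G} \subset \mathcal{L}(\Sigma)$ and $\tilde{\mathcal{G}} \subset \mathcal{L}(X)$ be as above.

(1) If $\mathcal{G}$ satisfies [A.1], then $\tilde{\mathcal{G}}$ satisfies [A.1].

(2) If $\mathcal{G}$ satisfies [A.2], then $\tilde{\mathcal{G}}$ satisfies [A.2].

(3) If $\mathcal{G}$ satisfies (I), then $\tilde{\mathcal{G}}$ satisfies (I).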

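Furthermore, if $\mathcal{C}^p \mathcal{G} \mathcal{C}^s$ is a decomposition for $\mathcal{L}(\Sigma)$, there is a natural
decomposition for $\mathcal{L}(X)$. We define $\tilde{\mathcal{C}}^p$ by taking $\Psi(\mathcal{C}^p \mathcal{L}_{2r}(\Sigma))$, and $\tilde{\mathcal{C}}^s$
by taking $\Psi(\mathcal{L}_{2r}(\Sigma) \mathcal{C}^s)$. It is easy to check the following lemma.

Lemma 2.13. If $\mathcal{C}^p \mathcal{G} \mathcal{C}^s$ is a decomposition for $\mathcal{L}(\Sigma)$, then $\tilde{\mathcal{C}}^p \tilde{\mathcal{G}} \tilde{\mathcal{C}}^s$ is a
decomposition for $\mathcal{L}(X)$. If $h(\mathcal{C}^p \cup \mathcal{C}^s) = 0$, then $h(\tilde{\mathcal{C}}^p \cup \tilde{\mathcal{C}}^s) = 0$.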

3. Consequences of Theorem A

3.1. Unique equilibrium states. The following result from [7, 8]
provides unique equilibrium states which satisfy the weak Gibbs
property [A.3], and is our key tool for finding reference measures to which
Theorem A applies. Roughly speaking, Conditions (I) and (II) state that
$\mathcal{C}^p$ and $\mathcal{C}^s$ contain all obstructions to specification (for the system) and
regularity (for the potential), while Condition (III) states that these
obstructions carry smaller pressure than the whole system.

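Theorem 3.1 ([8], Theorem C and Remark 2.2). Let $(X, \sigma)$ be a
subshift on a finite alphabet and $\varphi \in C(X)$ a potential. Suppose there
exist collections of words $\mathcal{C}^p, \mathcal{G}, \mathcal{C}^s \subset \mathcal{L}$ such that $\mathcal{C}^p \mathcal{G} \mathcal{C}^s = \mathcal{L}$ and the
following conditions hold:

(I) $\mathcal{G}^M$ has (W)-specification for every $M \in \mathbb{N}$;

(II) $\varphi$ has the Bowen property on $\mathcal{G}$;

(III) $P(\mathcal{C}^p \cup \mathcal{C}^s, \varphi) < P(\varphi)$.

Then $\varphi$ has a unique equilibrium state $m_\varphi$, and $m_\varphi$ is Gibbs with respect
to $\mathcal{G}$. In particular, $m_\varphi$ satisfies [A.3].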



In [8], the stronger condition of (S)-specification is assumed in (I),
but the proof goes through with only minor modifications, which we
present in Appendix A. Combining Theorem A and Theorem 3.1,
we immediately obtain the following result.

Theorem B. Let $X$ be a subshift on a finite alphabet and $\varphi \in C(X)$ a
potential. Suppose $\mathcal{L}$ has a decomposition $\mathcal{L} = \mathcal{C}^p \mathcal{G} \mathcal{C}^s$ satisfying [A.1],
[A.2] and (I)-(III). Then writing $m_\varphi$ for the unique equilibrium state
of $\varphi$, the system $(X, \sigma)$ satisfies a level-2 large deviation principle with
reference measure $m_\varphi$ and rate function $q^\varphi$ given by (1.1).

For a shift space $X$ and a collection of words $\mathcal{C} \subset \mathcal{L}$, it is typically
much easier to verify $h(\mathcal{C}) < h(X)$ than $P(\mathcal{C}, \varphi) < P(X, \varphi)$. For
$\beta$-shifts, it was shown in [HI Proposition 3.1] that (III) holds for every
Bowen potential $\varphi$, and we show in §5.1 that this is also true for $S$-gap
shifts. However, for other shift spaces where no analogous argument is
available yet, the following lemma is a convenient way to ensure that
(III) holds for a large class of functions.



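Lemma 3.2. Suppose $X$ is a shift space and $\mathcal{C} \subset \mathcal{L}(X)$ is a collection
of words such that $h(\mathcal{C}) < h(X)$, and let $\varphi\colon X \to \mathbb{R}$. If $\varphi$ satisfies the
bounded range condition

(BR) $\sup \varphi - \inf \varphi < h(X) - h(\mathcal{C})$,

then $P(\mathcal{C}, \varphi) < P(X, \varphi)$.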
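One way to see why a condition of the form (BR) suffices is the following short estimate. It is a sketch that uses only the definitions of entropy and pressure from §2.3, with $h(X) = h(\mathcal{L})$ and $\varphi$ bounded; it is not a reproduction of the argument given in the paper.

```latex
\begin{align*}
P(\mathcal{C},\varphi)
  &= \limsup_{n\to\infty} \frac1n \log \sum_{w\in\mathcal{C}_n} e^{\sup_{x\in[w]} S_n\varphi(x)}
   \le \limsup_{n\to\infty} \frac1n \log\bigl(\#\mathcal{C}_n\, e^{n\sup\varphi}\bigr)
   = h(\mathcal{C}) + \sup\varphi, \\
P(X,\varphi)
  &= \limsup_{n\to\infty} \frac1n \log \sum_{w\in\mathcal{L}_n} e^{\sup_{x\in[w]} S_n\varphi(x)}
   \ge \limsup_{n\to\infty} \frac1n \log\bigl(\#\mathcal{L}_n\, e^{n\inf\varphi}\bigr)
   = h(X) + \inf\varphi .
\end{align*}
```

Rearranging (BR) gives $h(\mathcal{C}) + \sup\varphi < h(X) + \inf\varphi$, so the two displayed bounds combine to $P(\mathcal{C},\varphi) < P(X,\varphi)$.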



Often we can take $h(\mathcal{C}) = 0$, in which case the condition (BR) on $\varphi$
reduces to the condition

(BR$_0$) $\sup \varphi - \inf \varphi < h(X)$.

3.2. Factors. Our results are well behaved under the operation of 
passing to a subshift factor and we have the following general theo- 
rem. 

Theorem C. Let $\Sigma$ be a subshift on a finite alphabet, and suppose that
$\mathcal{L} = \mathcal{L}(\Sigma)$ has a decomposition $\mathcal{L} = \mathcal{C}^p \mathcal{G} \mathcal{C}^s$ satisfying [A.1], [A.2] and
(I). Assume further that $h(\mathcal{C}) = 0$, where $\mathcal{C} = \mathcal{C}^p \cup \mathcal{C}^s$. Let $X$ be a
subshift factor of $\Sigma$ and $\varphi\colon X \to \mathbb{R}$ be a continuous function satisfying
(BR$_0$) and the Bowen property. Then $\varphi$ has a unique equilibrium state
$m_\varphi$, and the system $(X, \sigma)$ satisfies a level-2 large deviation principle
with reference measure $m_\varphi$ and rate function $q^\varphi$ given by (1.1).

3.3. Coded systems. Our examples all fall into the class of coded
systems, which are shift spaces whose language can be obtained by
freely concatenating a countable collection of generating words $G$. That
is, $X$ is a coded system if there exists a countable collection of finite
words $G$ such that the language $\mathcal{L}(X)$ is characterised as follows: a
word $w$ is contained in $\mathcal{L}(X)$ if and only if there are $w^1, \dots, w^n \in G$
such that $w$ is a subword of $w^1 \cdots w^n$.

For coded shifts we may take $\mathcal{G}$ to be the set of finite concatenations
of generators from $G$. Then [A.1] is automatically satisfied and the
brief argument of [7, §4] shows that (I) holds.
Theorem D. Let $X$ be a coded system, $m$ a Borel probability measure
on $X$, and $\varphi\colon X \to \mathbb{R}$ a continuous function. Let $G \subset \mathcal{L}(X)$ be a
collection of generators for $X$ and let $\mathcal{G}$ be the set of finite concatenations
of words in $G$. If [A.2] and [A.3] are satisfied, then $(X, \sigma)$ satisfies
a level-2 large deviation principle with reference measure $m$ and rate
function $q^\varphi$ given by (1.1).

The following result follows immediately from Theorem 3.1, Theorem
D, and Lemma 3.2.

Theorem E. Let $(X, \sigma)$ be a coded system with generating set $G$. Let
$\mathcal{G}$ be the collection of finite concatenations of generators, and let $\mathcal{C}$ be
the collection of prefixes and suffixes of generators. If $h(\mathcal{C}) < h(X)$
and if $\mathcal{L}$ is edit approachable by $\mathcal{G}$, then every potential $\varphi$ with the
Bowen property and the bounded range condition (BR) has a unique
equilibrium state satisfying the weak Gibbs property [A.3] and the large
deviations principle in Theorem A.

3.4. $\beta$-shifts and $S$-gap shifts. Our main examples are the $\beta$-shifts,
the $S$-gap shifts, and their subshift factors. For all of these examples
we can take $h(\mathcal{C}) = 0$, and for the $\beta$-shifts (in [HI Proposition 3.1]) and
$S$-gap shifts (in §5.1), we can show that $P(\mathcal{C}, \varphi) < P(\varphi)$ for all Bowen
potentials $\varphi$, which removes the need for the bounded range condition.
For factors of $\beta$-shifts and $S$-gap shifts, we do require the additional
assumption (BR$_0$) on the potential $\varphi$ at present.



3.4.1. S-gap shifts. An $S$-gap shift $\Sigma_S$ is a subshift of $\{0,1\}^{\mathbb{Z}}$ defined
by the rule that for a fixed $S \subset \{0, 1, 2, \dots\}$, the number of 0's between
consecutive 1's is an integer in $S$. More precisely, the language of $\Sigma_S$
is

$$\{0^n 1 0^{n_1} 1 0^{n_2} 1 \cdots 1 0^{n_k} 1 0^m \mid n_i \in S \text{ for all } 1 \le i \le k \text{ and } n, m \in \mathbb{N}\},$$

together with $\{0^n \mid n \in \mathbb{N}\}$, where we assume that $S$ is infinite (when
$S$ is finite, $\Sigma_S$ is sofic and can be analysed without the techniques of
this paper). The language for $\Sigma_S$ admits the following decomposition:

$$\mathcal{G} = \{0^{n_1} 1 0^{n_2} 1 \cdots 0^{n_k} 1 \mid n_i \in S \text{ for all } 1 \le i \le k\},$$
$$\mathcal{C}^p = \{0^n 1 \mid n \notin S\},$$
$$\mathcal{C}^s = \{0^n \mid n \in \mathbb{N}\},$$

which was first studied in [7]. We verify in §5.1 that this decomposition
satisfies Conditions [A.1], [A.2], and (I)-(III) for every Bowen
potential $\varphi$.

3.4.2. β-shifts. Fix $\beta > 1$, write $b = \lceil\beta\rceil$, and let $\omega^\beta \in \{0, 1, \dots, b-1\}^{\mathbb{N}}$
be the greedy β-expansion of 1. Then $\omega^\beta$ satisfies

$$\sum_{j=1}^{\infty} \omega^\beta_j \beta^{-j} = 1$$

and has the property that $\sigma^j(\omega^\beta) \preceq \omega^\beta$ for all $j \ge 1$, where $\preceq$ denotes
the lexicographic ordering. The β-shift is defined by

$$\Sigma_\beta = \{x \in \{0, 1, \dots, b-1\}^{\mathbb{N}} \mid \sigma^j(x) \preceq \omega^\beta \text{ for all } j \ge 1\}.$$
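As a rough illustration of the definition above, the following Python sketch computes the first digits of the greedy β-expansion of 1 and tests the lexicographic condition on a finite word. The function names are hypothetical, the arithmetic is floating point (so digits are only reliable for short prefixes and for β that is not an integer), and the admissibility test compares every suffix of the word, including the word itself, with the prefix of $\omega^\beta$ of the same length; that convention is an assumption of this sketch.

```python
from typing import List, Sequence


def greedy_expansion_of_one(beta: float, length: int) -> List[int]:
    """Return the first `length` digits of the greedy beta-expansion of 1.

    Floating-point sketch: reliable only for short prefixes, non-integer beta.
    """
    digits = []
    remainder = 1.0
    for _ in range(length):
        scaled = beta * remainder
        digit = int(scaled)  # floor, because scaled >= 0
        digits.append(digit)
        remainder = scaled - digit
    return digits


def is_admissible(word: Sequence[int], omega: Sequence[int]) -> bool:
    """Check that every suffix of `word` is lexicographically <= the
    prefix of `omega` of the same length."""
    n = len(word)
    if len(omega) < n:
        raise ValueError("need at least len(word) digits of omega")
    # Python compares lists lexicographically.
    return all(list(word[j:]) <= list(omega[: n - j]) for j in range(n))


omega = greedy_expansion_of_one(1.5, 9)
print(omega)
print(is_admissible([1, 0, 0, 1, 0], omega), is_admissible([1, 1], omega))
```

For β = 1.5 every intermediate value is a dyadic rational, so the floating-point digits are exact for this short prefix.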

The first and second author showed in [7, 8] that the language for $\Sigma_\beta$
admits a decomposition $\mathcal{L}(\Sigma_\beta) = \mathcal{G}\mathcal{C}^s$ that satisfies (I)-(III) for every
Bowen potential $\varphi$. In §5.2 we briefly review the construction and show
that conditions [A.1] and [A.2] are also satisfied.



3.4.3. Results for examples. We collect our results as applied to these
examples in the following theorem. We say that a subshift is non-trivial
if it is not a single periodic orbit. We proved in [7, Proposition 2.4]
that a non-trivial subshift factor of a β-shift or $S$-gap shift has positive
entropy.

Theorem F. Let $X$ and $\varphi$ be one of the following:

(1) $X$ is a β-shift or an $S$-gap shift, and $\varphi$ has the Bowen property;

(2) $X$ is a non-trivial subshift factor of a β-shift or an $S$-gap shift,
and $\varphi$ satisfies (BR$_0$) and the Bowen property.

Then $\varphi$ has a unique equilibrium state $m_\varphi$, and $(X,\sigma)$ satisfies a level-2
large deviation principle with reference measure $m_\varphi$ and rate function
$q^\varphi\colon \mathcal{M}(X) \to [-\infty, 0]$ given by

$$q^\varphi(\mu) = \begin{cases} h(\mu) + \int \varphi\,d\mu - P(\varphi) & \mu \text{ is } \sigma\text{-invariant}, \\ -\infty & \text{otherwise}. \end{cases}$$

In particular, taking for our reference measure the unique measure of
maximal entropy $m_0$ (since $\varphi = 0$ always satisfies (BR$_0$)), the system
satisfies a level-2 large deviation principle with rate function given by

$$q(\mu) = \begin{cases} h(\mu) - h_{\mathrm{top}}(X,\sigma) & \mu \text{ is } \sigma\text{-invariant}, \\ -\infty & \text{otherwise}. \end{cases}$$

To the best of our knowledge, the only statement above which was
previously known is the case when $X$ is a β-shift and $m_0$ is the measure
of maximal entropy [31] (apart from the exceptional set of special cases
above where $X$ has specification).

3.5. A 'horseshoe' theorem. We now state a result that we establish
as a key step in the proof of Theorem A, which may be of independent
interest.

Proposition 3.3. Let $X$ be a shift space and suppose that $\mathcal{G} \subset \mathcal{L}$
satisfies [A.1] and [A.2]. Then there exists an increasing sequence
$\{X_n\}$ of compact $\sigma$-invariant subsets of $X$ with the following properties.

(1) For every $n$ and every $w \in \mathcal{L}(X_n)$, there exist $u, v \in \mathcal{L}$ with
$|u|, |v| \le n + \tau$ such that $uwv \in \mathcal{G}$. In particular, this implies
that each $X_n$ has the (W)-specification property.

(2) Every invariant measure on $X$ is entropy approachable by ergodic
measures on $X_n$: for any $\eta > 0$, any $\mu \in \mathcal{M}_\sigma(X)$, and
any neighborhood $U$ of $\mu$ in $\mathcal{M}_\sigma(X)$, there exist $n \ge 1$ and
$\mu' \in \mathcal{M}^e_\sigma(X_n) \cap U$ such that $h(\mu') > h(\mu) - \eta$ holds.

By the variational principle and the entropy approachability in Proposition 3.3,
we have the further result that $\lim_{n\to\infty} h(X_n) = h(X)$, and
more generally

$$P(X,\varphi) = \lim_{n\to\infty} P(X_n,\varphi) = \sup_{n\in\mathbb{N}} P(X_n,\varphi)$$

for every $\varphi \in C(X)$. Thus, we can interpret the sets $X_n$ as well behaved
'horseshoes' which can be used to approximate the original space $X$,
revealing a structure reminiscent of Katok horseshoes [20]. Similarly,
the filtration $\mathcal{L} = \bigcup_{M\in\mathbb{N}} \mathcal{G}^M$ of the language of the shift can be
considered to be analogous to the filtration of a non-uniformly hyperbolic
set into Pesin sets.

Remark 3.4. In the case of coded systems, the subshifts $X_n$ also
satisfy $\bigcup_n X_n = X$ (see Lemma 4.5).

In the proof of the main results, we will use the following consequence
of the first property in Proposition 3.3. If a measure $m \in \mathcal{M}(X)$ is
Gibbs with respect to $\mathcal{G}$, then $m$ has the following Gibbs property on
the family of subshifts $\{X_n\}$: there exist constants $K_n, K' > 0$ such
that for every $x \in X_n$ and $k \in \mathbb{N}$, we have

(3.1) $$K_n \le \frac{m[x_1 \cdots x_k]}{e^{-kP(\varphi) + S_k\varphi(x)}} \le K'.$$

This follows from the fact that $x_1 \cdots x_k$ can be extended to a word in
$\mathcal{G}$ by adding a word to each end whose length is bounded by a constant
depending only on $n$.

4. Proof of Theorem A

The large deviations property in Definition 2.1 comprises an upper
bound and a lower bound. We establish the upper bound first, then
prove Proposition 3.3, which is the key to establishing the lower bound.
For the upper bound, we use criteria given by Pfister and Sullivan in
[31].



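4.1. Upper bound. Given $\mu \in \mathcal{M}_\sigma(X)$, let $q^\varphi(\mu) = h(\mu) + \int\varphi\,d\mu - P(\varphi)$,
as in (1.1). We show that for any closed set $F \subset \mathcal{M}(X)$, we
have

(4.1) $$\limsup_{n\to\infty} \frac{1}{n}\log m(\mathcal{E}_n^{-1}(F)) \le \sup_{\mu \in F \cap \mathcal{M}_\sigma(X)} q^\varphi(\mu).$$

Our key tool is the following result, which is a combination of Theorem 3.2
and Proposition 4.2 of [31].

Theorem 4.1. [31, Theorem 3.2 and Proposition 4.2] Let $(X,\sigma)$ be a
subshift, $m \in \mathcal{M}(X)$, $\psi \in C(X)$, and assume that the equation

(4.2) $$\limsup_{n\to\infty} \sup_{w\in\mathcal{L}_n} \left( \frac{1}{n}\log m([w]) + \frac{1}{n}\sup_{x\in[w]} S_n\psi(x) \right) \le 0$$

holds. Then

$$\limsup_{n\to\infty} \frac{1}{n}\log m(\mathcal{E}_n^{-1}(F)) \le \sup_{\mu \in F\cap\mathcal{M}_\sigma(X)} \left( h(\mu) - \int\psi\,d\mu \right).$$

We will apply Theorem 4.1 with $\psi = P(\varphi) - \varphi$. The upper Gibbs
bound in [A.3] (see (2.4)) yields a constant $K'$ such that

$$m([w]) \le K' e^{-nP(\varphi) + S_n\varphi(x)}$$

for every $w \in \mathcal{L}_n$ and every $x \in [w]$. Thus

$$\frac{1}{n}\log m([w]) + \frac{1}{n}\sup_{x\in[w]} S_n(P(\varphi) - \varphi)(x) \le \frac{1}{n}\log(K') \to 0$$

for every $w \in \mathcal{L}_n$. This establishes (4.2), so Theorem 4.1 provides the
desired upper bound.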

4.2. The gluing map. The specification property allows us to define
the following gluing map, which we consider in both Lemma 4.10 and
the Appendix. Given a collection of words $\mathcal{D} \subset \mathcal{L}$ with (W)-specification,
we write $\mathcal{D}^*$ for the set of all finite sequences $(w^1, \dots, w^m)$ where each
$w^i \in \mathcal{D}$. For each $(w^1, \dots, w^m) \in \mathcal{D}^*$, we use (W)-specification to
choose a word

(4.3) $$\Phi(w^1, \dots, w^m) := w^1 v^1 \cdots v^{m-1} w^m,$$

so that $v^1, \dots, v^{m-1}$ are words with $|v^i| \le \tau$ and $\Phi(w^1, \dots, w^m) \in \mathcal{L}$.
Selecting such a word for each $(w^1, \dots, w^m) \in \mathcal{D}^*$ defines a map
$\Phi\colon \mathcal{D}^* \to \mathcal{L}$ which we call the gluing map. The map $\Phi$ extends from
$\mathcal{D}^*$ to $\mathcal{D}^{\mathbb{N}}$ in the natural way. Note that if $\mathcal{D}$ satisfies the stronger
assumption of the (gcW)-specification property, then each of the words
$\Phi(w^1, \dots, w^m)$ also belongs to $\mathcal{D}$.

When $\mathcal{D}$ has (S)-specification, we are able to further require that
the words $v^1, \dots, v^{m-1}$ in (4.3) have equal length, and it is obvious
that for any choice of $n_1, \dots, n_k$, the restriction of the gluing map $\Phi$ to
$\prod_{i=1}^k \mathcal{D}_{n_i}$ is injective.

When $\mathcal{D}$ only has (W)-specification, we need to handle the possibility
that $\Phi$ may fail to be injective on $\prod_{i=1}^k \mathcal{D}_{n_i}$. Because the gluing words
$v^i$ may vary in length, the words $\Phi(\mathbf{w})$ may have different lengths for
different choices of $\mathbf{w}$, and so we work with the following truncated
gluing map $\Phi_0\colon \mathcal{D}^* \to \mathcal{L}$.

Definition 4.2. For each $n_1, \dots, n_k \in \mathbb{N}$, the truncated gluing map
$\Phi_0$ on $\prod_{i=1}^k \mathcal{D}_{n_i}$ is the map which takes $(w^1, \dots, w^k) \in \prod_{i=1}^k \mathcal{D}_{n_i}$ to
$\Phi(w^1, \dots, w^k)$, and then truncates to the first $\sum_{i=1}^k n_i$ symbols. That
is, writing $N = \sum_{i=1}^k n_i$, the map $\Phi_0\colon \prod_{i=1}^k \mathcal{D}_{n_i} \to \mathcal{L}_N$ is given by
$z_N \circ \Phi$, where $z_N$ is the map on words of length at least $N$ which
truncates to the first $N$ symbols.

Lemma 4.3. Suppose $\mathcal{D} \subset \mathcal{L}$ has (W)-specification. There exists
$C > 0$ such that for any $n_1, \dots, n_k \in \mathbb{N}$, the truncated gluing map
$\Phi_0\colon \prod_{i=1}^k \mathcal{D}_{n_i} \to \mathcal{L}_N$ satisfies $\#\Phi_0^{-1}(v) \le C^k$ for each $v \in \mathcal{L}_N$, where
$N = \sum_{i=1}^k n_i$.

Proof. Define $\ell\colon \mathcal{D}^* \to \{0, 1, \dots, \tau\}^*$ by $\ell(\mathbf{w}) = (|u^1|, |u^2|, \dots, |u^{k-1}|)$,
where $u^i$ are the gluing words from the definition of $\Phi$. We consider
$\Phi_0\colon \prod_{i=1}^k \mathcal{D}_{n_i} \to \mathcal{L}$. The image of $\Phi_0$ is a subset of $\mathcal{L}_N$, where
$N = \sum_{i=1}^k n_i$, so we choose $v \in \mathcal{L}_N$.

We fix $\bar\tau = (\tau_1, \dots, \tau_{k-1}) \in \{0, 1, \dots, \tau\}^{k-1}$, and consider the number
of possible $\mathbf{w} = (w^1, \dots, w^k) \in \prod_{i=1}^k \mathcal{D}_{n_i}$ such that both $\Phi_0(\mathbf{w}) = v$
and $\ell(\mathbf{w}) = \bar\tau$. Note that $v$ determines the first $N$ symbols of
$w^1 u^1 w^2 \cdots u^{k-1} w^k$, and thus it determines each symbol $w^i_j$ which has
not been truncated from the end of $w^1 u^1 w^2 \cdots u^{k-1} w^k$. This leaves
$\sum_{i=1}^{k-1} \tau_i \le k\tau$ remaining symbols $w^i_j$ which are not determined by $v$,
and thus

$$\#\{\mathbf{w} \mid \Phi_0(\mathbf{w}) = v \text{ and } \ell(\mathbf{w}) = \bar\tau\} \le p^{k\tau},$$

where $p$ is the size of the alphabet. There are at most $(\tau + 1)^k$ choices
for $\bar\tau$, and thus

$$\#\Phi_0^{-1}(v) \le p^{k\tau}(\tau + 1)^k = (p^\tau(\tau + 1))^k,$$

which completes the proof of the lemma. □

4.3. Proof of Proposition 3.3. We establish Proposition 3.3, which
is crucial for our lower bounds. This is the most involved stage of our
proof.

Step 0: Definition and basic properties of $X_n$. First, we
define the sequence of shift spaces $X_n$ which will meet our requirements.

Let $\mathcal{G}_{\le n} = \bigcup_{i=0}^n \mathcal{G}_i$, and consider the set of words

(4.4) $$\Phi(\mathcal{G}_{\le n}^*) := \bigcup_{m=1}^{\infty} \{\Phi(w_1, \dots, w_m) \mid w_i \in \mathcal{G}_{\le n},\ 1 \le i \le m\}.$$

We can turn this set into the language of a shift space by including all
subwords to obtain

(4.5) $$\mathcal{L}(X_n) := \{\text{all subwords of elements of } \Phi(\mathcal{G}_{\le n}^*)\}.$$

Then $X_n$ is defined as the shift whose language is $\mathcal{L}(X_n)$; that is, $X_n$
is the set of sequences for which every finite subword is in $\mathcal{L}(X_n)$.

Lemma 4.4. $X_n$ is a well-defined shift space. Furthermore, $X_n$ has
the following properties.

(1) For every $w \in \mathcal{L}(X_n)$, there exist $u, v \in \mathcal{L}$ such that $|u|, |v| \le n + \tau$
and $uwv \in \mathcal{G}$, where $\tau$ is the transition time in the (gcW)-specification
property of $\mathcal{G}$.

(2) $X_n$ has (W)-specification.

Proof. By [2H Proposition 1.3.4], to check that a collection of words is 
a language for some shift space, we need only check the following two 
conditions. 

• Every subblock of a word in $\mathcal{L}(X_n)$ is in $\mathcal{L}(X_n)$.

• For every $w \in \mathcal{L}(X_n)$, there are non-empty $u, v \in \mathcal{L}(X_n)$ so
that $uwv \in \mathcal{L}(X_n)$.

The first condition is satisfied by definition, so we verify the second.
Let $w \in \mathcal{L}(X_n)$. Then $w$ is a subword of some word of the form
$\Phi(w_1, \dots, w_m)$, and for any $u \in \mathcal{G}_{\le n}$, the word $\Phi(u, w_1, \dots, w_m, u)$
satisfies the required condition.

To check the first property claimed for $X_n$, we observe that every
$w \in \mathcal{L}(X_n)$ is a subword of $w^1 v^1 \cdots w^m$ for some $w^i \in \mathcal{G}$ and $v^i$ such
that the conditions in Definition 2.3 hold. In particular, by appending
at most $n + \tau$ symbols to either end of $w$, we obtain a word of the form
$w^i v^i \cdots w^j \in \mathcal{G}$.

This observation also implies that $X_n$ has (W)-specification. Indeed,
given any collection of words $w^1, \dots, w^m \in \mathcal{L}(X_n)$, we extend these to
words $\bar w^1, \dots, \bar w^m \in \mathcal{G}$ by appending at most $n + \tau$ symbols to either
end of each word $w^j$, and then use (W)-specification for $\mathcal{G}$ to conclude
that $X_n$ has (W)-specification with transition time $3\tau + 2n$. □

We also include the following lemma for completeness, although it 
plays no role in our proof. 

Lemma 4.5. Suppose that we also have the following condition (which
is satisfied for all our examples): for every $v \in \mathcal{L}$, there exist words
$u, w \in \mathcal{L}$ so that $uvw \in \mathcal{G}$. Then the shift spaces provided by Lemma 4.4
satisfy $X = \bigcup_n X_n$.

Proof. It suffices to show that $\mathcal{L}(X) = \bigcup_n \mathcal{L}(X_n)$. This is easy, since
by assumption for any $w \in \mathcal{L}$, there are $u, v$ such that $uwv \in \mathcal{G}$. Thus
$uwv \in \mathcal{L}(X_{|uwv|})$, and hence $w \in \mathcal{L}(X_{|uwv|})$. □

The rest of the proof of Proposition 3.3 is an extension of the approach
used by Pfister and Sullivan in [32]:

(1) Construct a subshift $Y \subset X_n$ for some $n \ge 1$ such that every
$\nu \in \mathcal{M}_\sigma(Y)$ is weak*-close to $\mu$.

(2) Use edit approachability of $\mathcal{L}$ by $\mathcal{G}$ to explicitly build a subshift
$H \subset Y$ with a rich structure.

(3) Show that $H$ (and hence $Y$) has entropy close to $h(\mu)$ by using
this structure.

(4) Obtain the measures $\mu'$ as maximal entropy measures for $Y$.

In preparation for the above steps, fix $\eta > 0$ and use the ergodic
decomposition of $\mu$ together with affinity of the entropy map to find
$\lambda = \sum_{i=1}^p a_i \mu_i$ such that

• the $\mu_i$ are ergodic;

• the $a_i$ are rational numbers in $[0, 1]$ such that $\sum_{i=1}^p a_i = 1$;

• $D(\mu, \lambda) \le \eta$;

• $h(\lambda) > h(\mu) - \eta$.

Let $h_i = 0$ when $h(\mu_i) = 0$, and $\max(0, h(\mu_i) - \eta) < h_i < h(\mu_i)$
otherwise.

Definition 4.6. Given $\nu \in \mathcal{M}(X)$ and $\zeta > 0$, let

$$\mathcal{L}^{\nu,\zeta} := \{w \in \mathcal{L} \mid D(\mathcal{E}_{|w|}(x), \nu) < \zeta \text{ for all } x \in [w]\}.$$

Combining [31, Propositions 2.1 and 4.1], we have the following
lemma.

Lemma 4.7. [31, Propositions 2.1 and 4.1] There exists $N \in \mathbb{N}$ such
that for $n \ge N$ and $1 \le i \le p$, we have $\#\mathcal{L}_n^{\mu_i,\eta} \ge e^{n h_i}$.

Because $\mathcal{L}$ is edit approachable by $\mathcal{G}$, there is a mistake function $g$
such that every $w \in \mathcal{L}$ has $v \in \mathcal{G}$ with $d(v, w) \le g(|w|)$. By Lemma 2.8
we can choose $N$ large enough so that, in addition to the cardinality
estimates in Lemma 4.7, we have the following property.

• If $n \ge N$ and $x, y \in X$ are such that $d(x_1 \cdots x_n, y_1 \cdots y_m) \le \tau + g(n)$,
then $D(\mathcal{E}_n(x), \mathcal{E}_m(y)) < \eta$.

Without loss of generality, assume that $a_i < 1$ for each $i$. Choose $n$
such that we have $n_i := a_i n \in \mathbb{N}$, $n_i + g(n_i) \le n$, and $n_i \ge N$ for every
$i$, and moreover

(4.6) r^(/^(A) -v)> h{\) - 2r^. 

To prove the proposition, we will follow the steps listed above to show
that there exists $\mu' \in \mathcal{M}^e_\sigma(X_n)$ such that $D(\mu, \mu') \le 6\eta$ and
$h(\mu') > h(\mu) - 4\eta$.

Step 1: Definition of $Y \subset X_n$. Fix $K \in \mathbb{N}$ such that $4/K < \eta$.
Now let

(4.7) $$Y := \{x \in X_n \mid x_t x_{t+1} \cdots x_{t+Kn-1} \in \mathcal{L}^{\mu,5\eta}_{Kn} \text{ for all } t \ge 0\}.$$

Then $Y \subset X_n$ is compact and $\sigma$-invariant. Moreover, the following
holds.

Lemma 4.8. We have $D(\mu, \nu) \le 6\eta$ for any $\nu \in \mathcal{M}^e_\sigma(Y)$.

Proof. Since $\nu$ is ergodic, there exists a generic point $x \in Y$, that is,
$\mathcal{E}_m(x)$ converges to $\nu$. We choose $L$ so that $nK/L < \eta$ holds, take
an arbitrary integer $m \ge L$ and choose integers $s$ and $0 \le q < Kn$
so that $m = sKn + q$ holds. Then, using (4.7) and the inequalities
$\frac{q}{m} \le \frac{Kn}{L} < \eta$, we have

$$D(\mathcal{E}_m(x), \mu) \le \sum_{i=0}^{s-1} \frac{Kn}{m} D(\mathcal{E}_{Kn}(\sigma^{iKn}x), \mu) + \frac{q}{m} D(\mathcal{E}_q(\sigma^{sKn}x), \mu) \le 5\eta + \eta = 6\eta.$$

Thus taking $m \to \infty$, we have the lemma. □

Step 2: Construction of $H$. For brevity of notation we write
$\mathcal{D}^i = \mathcal{L}_{n_i}^{\mu_i,\eta}$. Extend the definitions of $n_i, \mathcal{D}^i, \mu_i, a_i$ to indices $i > p$ by
repeating periodically: that is, if $i = pq + r$, $1 \le r \le p$, then $n_i = n_r$,
and likewise for $\mathcal{D}^i$, $\mu_i$ and $a_i$.

By the assumption that $\mathcal{L}$ is edit approachable by $\mathcal{G}$, we can define a
map $\phi_g\colon \mathcal{L} \to \mathcal{G}$ such that $d(w, \phi_g(w)) \le g(|w|)$. We can define a map
$\Phi_{EG}$ from the set of finite collections of words $\mathcal{L}^*$ to $\mathcal{G}$ by 'editing then
gluing'. That is, given $(w^1, \dots, w^m) \in \mathcal{L}^*$, we put $\Phi_{EG}(w^1, \dots, w^m) =
\Phi(\phi_g(w^1), \dots, \phi_g(w^m))$, where $\Phi$ is the gluing map (4.3). The map $\Phi_{EG}$
extends to subsets of $\mathcal{L}^{\mathbb{N}}$ in the natural way, and we consider it here
with the following domain:

(4.8) $$\Phi_{EG}\colon \prod_{i=1}^{\infty} \mathcal{D}^i \to X.$$

In other words, given $\mathbf{w} = (w^i) \in \prod_{i=1}^{\infty} \mathcal{D}^i$, let $v^i = \phi_g(w^i) \in \mathcal{G}$ and
$\Phi_{EG}(\mathbf{w}) = v^1 u^1 v^2 u^2 \cdots$, where $u^i$ with $|u^i| \le \tau$ are the "gluing words"
provided by (W)-specification. Let $H = \Phi_{EG}(\prod_{i=1}^{\infty} \mathcal{D}^i)$. Then we have
$H \subset X_n$ since $n_i + g(n_i) \le n$.

A sort of periodicity is built into the definition of the sequences
$\phi(\mathbf{w}) = \Phi_{EG}(\mathbf{w})$: the word $v^i$ is an approximation of a suitable generic
point for the measure $\mu_i$, and the measures $\mu_i$ repeat periodically
($\mu_{i+p} = \mu_i$). The following lemma states that following $\phi(\mathbf{w})$ for a single
"cycle" of this periodic behaviour gives a good approximation to $\mu$. We write
$\ell_j = \ell_j(\mathbf{w}) = |v^j u^j|$ for the length of the words associated to the index
$j$, and observe that $|\ell_j - n_j| \le \tau + g(n_j)$.



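Lemma 4.9. Fix $\mathbf{w} \in \prod_{i=1}^{\infty} \mathcal{D}^i$. For $q \ge 0$, let $c_q = \sum_{r=1}^p \ell_{qp+r}$
be the length of the $q$th "cycle" in $\phi(\mathbf{w})$ and let $b_m = b_m(\mathbf{w}) = \sum_{q=0}^{m-1} c_q$.
Then we have $D(\mathcal{E}_{c_m}(\sigma^{b_m}\phi(\mathbf{w})), \mu) \le 3\eta$.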

Proof. Choose $x^j \in [w^j]$ for each $j \in \mathbb{N}$, so that by the definition of $\mathcal{D}^j$
we have $D(\mathcal{E}_{n_j}(x^j), \mu_j) < \eta$. Let $y = \sigma^{b_m}\phi(\mathbf{w})$ and let $d_j = \sum_{i=1}^{j-1} \ell_{mp+i}$
for $1 \le j \le p$. By the definition of $\phi$ and the property following
Lemma 4.7, we have

$$D(\mathcal{E}_{\ell_{mp+j}}(\sigma^{d_j}y), \mu_j) \le D(\mathcal{E}_{\ell_{mp+j}}(\sigma^{d_j}y), \mathcal{E}_{n_j}(x^{mp+j})) + D(\mathcal{E}_{n_j}(x^{mp+j}), \mu_j) < 2\eta.$$

Observe that $c_q \approx n$: more precisely, we have

(4.9) $$|c_q - n| \le \sum_{r=1}^{p} |\ell_{qp+r} - n_r| \le p(\tau + g(n)).$$

Taking convex combinations gives

( "P P p \ 



p 

2?7 < 3?7, 



ttj 



provided $N$ is chosen large enough such that $n \ge n_j \ge N$, and such
that (4.9) guarantees we have $\sum_{j=1}^{p} \left| a_j - \frac{\ell_{mp+j}}{c_m} \right| \le \eta$. □

We are now in a position to show that $H \subset Y$. Given $y = \phi(\mathbf{w}) \in H$
and $t \in \mathbb{N}$, we can choose $m_1, m_2$ such that

$$b_{m_1 - 1} < t \le b_{m_1} < b_{m_2} \le t + Kn < b_{m_2+1},$$
and so

Kn 



£Kn{a'y)= [ Yl |^^c,(a'^y))+ei^fe™,-t(^*y)+6^i+X„-6™.(^''"^y), 



where < ^1,^2 < ""'"^ xn — ^' Each of the empirical measures in 
the large sum is within $3\eta$ of $\mu$, by Lemma 4.9, and thus we have

$$D(\mathcal{E}_{Kn}(\sigma^t y), \mu) < 5\eta.$$

In particular, this shows that $y \in Y$.

Step 3: Estimation of entropy of $H$. Now we use the definition
of $H$ to estimate its topological entropy. Our key tool will be the
estimates obtained in Lemmas 2.6 and 4.3.

Lemma 4.10. The topological entropy of $H$ is at least $h(\mu) - 4\eta$.

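Proof. Fix $m \in \mathbb{N}$ and set $b' = n + p\tau + \sum_{i=1}^p g(n_i)$. Note that

(4.10) $$mb' \ge \sup_{\mathbf{w}} b_m(\mathbf{w}) \quad\text{and}\quad \frac{n}{b'}(h(\lambda) - \eta) \ge h(\lambda) - 2\eta$$

holds, where $b_m(\mathbf{w})$ is as in Lemma 4.9. Moreover, since $n = \sum_{j=1}^p n_j$
and each $n_j \ge N$, we have $b' \ge pN$.

Let $\phi_m\colon \prod_{i=1}^{mp} \mathcal{D}^i \to \mathcal{L}_{mb'}$ be the map that takes $w^1, \dots, w^{mp}$ to the
first $mb'$ symbols of $\Phi_{EG}(\mathbf{w})$, where $\mathbf{w} \in \prod_{i=1}^{\infty} \mathcal{D}^i$ and $\Phi_{EG}$ is the 'edit
and glue' map from Step 2. Note that $\phi_m(\prod_{i=1}^{mp} \mathcal{D}^i) \subset \mathcal{L}_{mb'}(H)$.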

Now in order to estimate the entropy of $H$, we will use our estimates
on the cardinality of $\mathcal{D}^i$ together with a bound on $\#\phi_m^{-1}(v)$ for
$v \in \mathcal{L}_{mb'}$. Recall that $\phi_g\colon \mathcal{L} \to \mathcal{G}$ is a map which satisfies
$d(w, \phi_g(w)) \le g(|w|)$. First we use Lemma 2.6, recalling that $g(n)/n \to 0$,
to fix $N_0$ sufficiently large so that

(4.11) $$\#\{w \in \mathcal{L} \mid \phi_g(w) = v\} \le e^{\eta|v|/3}$$

for every $v \in \mathcal{G}$ with $|v| \ge N_0$.

Now as $w$ ranges over $\mathcal{D}^i$, the word $\phi_g(w)$ may vary in length; however,
since its $d$-distance from $w$ is at most $g(|w|)$, the number of
different lengths it can take is at most $2g(|w|)$. Similarly for
$\mathbf{w} = (w^1, \dots, w^{mp})$: given $\mathbf{w} \in \prod_{j=1}^{mp} \mathcal{D}^j$, we have
$(\phi_g(w^1), \dots, \phi_g(w^{mp})) \in \prod_{j=1}^{mp} \mathcal{G}_{n'_j}$ for
some $\mathbf{n}' = (n'_1, \dots, n'_{mp})$. We see that each $n'_j$ can take at most
$2g(|w^j|) = 2g(n_j) \le 2g(n)$ different values, and so in particular, as
$\mathbf{w}$ ranges over $\prod_{j=1}^{mp} \mathcal{D}^j$, the number of different values taken by $\mathbf{n}'$ is
bounded above by

(4.12) (2^(n))™P = e"*P'°s(23(n,)) < ^^» min, „,. < ^v^b'/3^ 



w ranges over Y[T=i'^'' ^ ^^^ number of different values taken by n is 



,,log(29(n)) 



where the last inequality follows from observing that $n_j = a_j n$ and
choosing $N$ sufficiently large (since each $n_j \ge N$).

For each choice of $\mathbf{n}'$, we may bound the multiplicity of the truncated
gluing map $\Phi_0\colon \prod_{j=1}^{mp} \mathcal{G}_{n'_j} \to \mathcal{L}_{\sum n'_j}$ using Lemma 4.3. Let $\Phi_{mb'}$ be the
map that truncates to the first $mb'$ symbols (instead of $\sum n'_j$). Because
$n'_j \le n_j + g(n_j)$, we have

$$\sum_{j=1}^{mp} n'_j \le \sum_{j=1}^{mp} \big(n_j + g(n_j)\big) = mn + m\sum_{j=1}^{p} g(n_j) \le mb'.$$

In particular, the first $mb'$ symbols determine the first $\sum n'_j$ symbols,
and we conclude that for every $v \in \mathcal{L}_{mb'}(H)$ and fixed choice of $\mathbf{n}'$, we
have

$$\#\Phi_{mb'}^{-1}(v) \le C^{mp}.$$

Assume that $N$ was chosen large enough such that this bound gives
$\#\Phi_{mb'}^{-1}(v) \le e^{\eta mb'/3}$. Together with the previous bounds, we see that for
every $v \in \mathcal{L}_{mb'}(H)$, we have

$$\#\phi_m^{-1}(v) \le e^{\eta mb'},$$

and thus we obtain the estimate

$$\#\mathcal{L}_{mb'}(H) \ge e^{-\eta mb'} \prod_{i=1}^{mp} (\#\mathcal{D}^i).$$

Using Lemma 4.7, it follows that

/ mp \ 

( mp \ 

Step 4: End of the proof of Proposition 3.3. Let $\mu'$ be an
ergodic measure of maximal entropy for $Y$. Lemma 4.8 shows that
$D(\mu', \mu) \le 6\eta$, and Lemma 4.10 shows that $h(\mu') \ge h(\mu) - 4\eta$. Since
$Y \subset X_n$ by definition, this completes the proof of Proposition 3.3.

4.4. Lower bounds. Now we complete the proof of Theorem A by
showing that the lower bound

(4.13) $$\liminf_{n\to\infty} \frac{1}{n}\log m(\{x \in X : \mathcal{E}_n(x) \in U\}) \ge \sup_{\mu\in U} q^\varphi(\mu)$$

holds for any open set $U \subset \mathcal{M}(X)$, where $q^\varphi(\mu)$ is as in (1.1).

To show (4.13), it is sufficient to show that for any $\mu \in \mathcal{M}(X)$ and
any open neighborhood $U \subset \mathcal{M}(X)$ of $\mu$,

(4.14) $$\liminf_{n\to\infty} \frac{1}{n}\log m(\{x \in X : \mathcal{E}_n(x) \in U\}) \ge q^\varphi(\mu).$$

If $\mu$ is not $\sigma$-invariant, then $q^\varphi(\mu) = -\infty$ and so the inequality (4.14) is
trivial. Thus, we will prove (4.14) for $\mu \in \mathcal{M}_\sigma(X)$.

Let $\mu \in \mathcal{M}_\sigma(X)$ and $\eta > 0$. Then by Proposition 3.3 there exists an
ergodic measure $\nu \in U \cap \mathcal{M}_\sigma(X_k)$ for some $k$ such that $h(\nu) > h(\mu) - \eta$
and $\int\varphi\,d\nu > \int\varphi\,d\mu - \eta$. We use $\nu$ to build a subset of $\mathcal{E}_n^{-1}(U)$, as
follows.

Take $\zeta > 0$ so small that the $2\zeta$-neighbourhood of $\nu$ is contained in $U$ and
every measure $\nu'$ in this neighbourhood has $|\int\varphi\,d\nu' - \int\varphi\,d\nu| < \eta$.
In particular, for every $w \in \mathcal{L}_n^{\nu,\zeta}$, we have $[w] \subset \mathcal{E}_n^{-1}(U)$. Then, again by
[31, Propositions 2.1 and 4.1], for all sufficiently large $n$ we have

(4.15) $$\#(\mathcal{L}_n^{\nu,\zeta} \cap \mathcal{L}(X_k)) \ge e^{n(h(\nu) - \eta)}.$$

We note that by the Gibbs property (3.1), we have

$$m[w] \ge K_k e^{-nP(\varphi) + S_n\varphi(x)}$$

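for all $w \in \mathcal{L}(X_k)_n$ and $x \in [w]$. In particular, when $w \in \mathcal{L}_n^{\nu,\zeta}$ this
yields

$$m[w] \ge K_k e^{-nP(\varphi) + n(\int\varphi\,d\nu - \eta)}.$$

Using the estimate (4.15) and the fact that $[w] \subset \mathcal{E}_n^{-1}(U)$ for every
$w \in \mathcal{L}_n^{\nu,\zeta}$, we obtain

$$m(\mathcal{E}_n^{-1}(U)) \ge K_k e^{n(h(\nu) - \eta)} e^{-nP(\varphi) + n(\int\varphi\,d\nu - \eta)} \ge K_k e^{n(q^\varphi(\mu) - 4\eta)}.$$

Since $\eta > 0$ was arbitrary, this establishes the lower bound (4.14).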

5. Applications 

5.1. S-gap shifts. To check the conditions of Theorem B for a potential
$\varphi$ with the Bowen property, we verify the specification properties
[A.1] and (I) on $\mathcal{G}$ and $\mathcal{G}^M$, the edit approachability property [A.2],
and the estimate (III) on $P(\mathcal{C}^p \cup \mathcal{C}^s, \varphi)$.

5.1.1. Specification properties. Condition [A.1] is immediate because
$\mathcal{G}$ has (O)-specification. Condition (I) holds because a word in $\mathcal{G}^M$ has
the form $0^n 1 0^{n_1} 1 0^{n_2} \cdots 1 0^{n_k} 1 0^m$, where $n_i \in S$ for all $1 \le i \le k$, and
$n, m \le M$. Thus, any word in $\mathcal{G}^M$ can be extended to a word in $\mathcal{G}$
by adding a uniformly bounded number of symbols at each end (the
number of symbols to be added depends on $M$, but not on the length of
the word), and this implies that $\mathcal{G}^M$ has the (W)-specification property.
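To make the shape of these words concrete, here is a small Python sketch that tests whether a finite binary string lies in the language of an $S$-gap shift or in the collection $\mathcal{G}$ of the decomposition in §3.4.1. The helper names and the predicate `in_S` are hypothetical, words are encoded as strings over "0" and "1", and treating the empty word as an element of $\mathcal{G}$ is a convention assumed here rather than taken from the text.

```python
from typing import Callable, List


def zero_runs(word: str) -> List[int]:
    """Lengths of the runs of 0s delimited by 1s, including the (possibly
    empty) runs before the first 1 and after the last 1."""
    return [len(run) for run in word.split("1")]


def in_language(word: str, in_S: Callable[[int], bool]) -> bool:
    """Word of the form 0^n 1 0^{n_1} 1 ... 1 0^{n_k} 1 0^m (or 0^n):
    only the runs strictly between two 1s are constrained to lie in S."""
    return all(in_S(r) for r in zero_runs(word)[1:-1])


def in_G(word: str, in_S: Callable[[int], bool]) -> bool:
    """Word of the form 0^{n_1} 1 0^{n_2} 1 ... 0^{n_k} 1 with every n_i in S."""
    if word == "":
        return True  # convention assumed for this sketch
    if not word.endswith("1"):
        return False
    # Drop the empty run that follows the final 1.
    return all(in_S(r) for r in zero_runs(word)[:-1])


def is_even(n: int) -> bool:
    """Illustrative choice S = {0, 2, 4, ...}."""
    return n % 2 == 0


print(in_language("0010010", is_even), in_G("001001", is_even))
```

With this encoding, a word of the language fails to be in $\mathcal{G}$ only because of its first and last runs of 0s, which is the reason a bounded prefix and suffix suffice in the argument above.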

5.1.2. Edit approachability. Because $S$ is infinite, we can choose for
every $n \in \mathbb{N}$ some $s_n \in S$ such that $s_n/n \to 0$ and $s_n \to \infty$. (Note that
the same element of $S$ may appear as $s_n$ for multiple values of $n$.) Now
define $g\colon \mathbb{N} \to \mathbb{N}$ by $g(n) := 2(\lceil n/s_n \rceil + s_n)$, and observe that $g$ is a
mistake function.
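The following Python sketch illustrates one admissible choice of $s_n$ and the resulting function $g$, under assumptions that are not in the text: $S$ is taken to be the powers of two, $s_n$ is the largest power of two not exceeding $\sqrt{n}$, and the bracket in the formula is read as a ceiling. With this choice $s_n \to \infty$ and $s_n/n \to 0$, so $g(n)/n \to 0$.

```python
import math


def s_n(n: int) -> int:
    """Largest power of two not exceeding sqrt(n), for n >= 1.

    Illustrative choice for the hypothetical gap set S = {1, 2, 4, 8, ...}.
    """
    root = math.isqrt(n)  # root >= 1 because n >= 1
    return 1 << (root.bit_length() - 1)


def g(n: int) -> int:
    """g(n) = 2 * (ceil(n / s_n) + s_n), computed in integer arithmetic."""
    s = s_n(n)
    return 2 * (-(-n // s) + s)


for n in (100, 10_000, 1_000_000):
    print(n, s_n(n), g(n), g(n) / n)
```

The ratio $g(n)/n$ printed in the last column decreases along this sample, which is the sublinear growth required of a mistake function.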

Let z G C{X)n and write s = Sn- The word z has the form 

We now change < k/s of the symbols 0^' to form the word z^ : = 
OnonO"---OnO" {0 < i < s). We also change < £/s of the sym- 
bols 0^ to form the word z' := OnonO" ■ ■■O'W (0 < j < s). We set 



w 



zP10"ilO"2...0"a2^ 



u 



d{z,uwv) < 2{\n/sn\ + s„) 

Q. This shows that Q satisfies [A. 2] 



:= 0"-* 
g{n 



and V := 0'^ ^1. Then we have 
and uwv G ^ by the definition of 



5.1.3. Estimating $P(\mathcal{C}^p \cup \mathcal{C}^s, \varphi)$. Now we show that if $\varphi$ is any potential
with the Bowen property on an $S$-gap shift, then $P(\mathcal{C}^p \cup \mathcal{C}^s, \varphi) < P(\varphi)$,
verifying Condition (III). It is easy to see that $h(\mathcal{C}^p \cup \mathcal{C}^s) = 0$, so it
suffices to show that

(5.1) $$P(\varphi) > \limsup_{n\to\infty} \sup_{x\in X} \frac{1}{n} S_n\varphi(x).$$

Our strategy is to produce a large number of admissible words that are
close (in the edit metric) to a given word, so that no single word can
carry full pressure. This strategy was also used to establish (5.1) for
β-shifts in [8, Proposition 3.1]. For $S$-gap shifts, we must deal with
a difficulty which does not occur for β-shifts: if $x \in \Sigma_S$ is such that
positions $i$ and $j$ both admit edits yielding new words $x', x'' \in \Sigma_S$, it
may not be possible to make both edits simultaneously. This lack of
independence between the possible edits means that it is more difficult
to produce nearby words than in the case of β-shifts. Here, we state a
sequence of lemmas which prove (5.1), whose proofs are given in §6.

Lemma 5.1. We have $P(\varphi) > \varphi(0)$.

In the following lemma, we use Lemma 5.1 to control words which
have a small frequency of occurrence of the symbol 1.

Lemma 5.2. There exists $\varepsilon > 0$ and a constant $L = L(\varepsilon)$ so that
if $x_1 \cdots x_n$ contains fewer than $\varepsilon n$ occurrences of the symbol 1, then
$\frac{1}{n} S_n\varphi(x) \le \varphi(0) + L < P(\varphi) - L$.

We now control words which do not have a small frequency of occurrence
of the symbol 1. This is where we use our strategy of creating
a large number of new words by making edits. We need the following
estimate, which is a consequence of Stirling's formula.



Lemma 5.3. If $\delta n \le k \le \frac{n}{2}$, then $\log\binom{n}{k} \ge -n\delta\log\delta - 2\log n$.



This estimate can be used to give a lower bound on the cardinality
of a set of words where we can control the Birkhoff averages of $\varphi$, and
we can use this to estimate the pressure from below.

Lemma 5.4. Let $\varepsilon$ be as in Lemma 5.2. There exists $L' > 0$ such
that whenever $n$ is sufficiently large and $x_1 \cdots x_n \in \mathcal{L}$ contains $m \ge \varepsilon n$
occurrences of the symbol 1, we have $\frac{1}{n} S_n\varphi(x) \le P(\varphi) - L'$.

We conclude from Lemma 5.2 and Lemma 5.4 that

$$\limsup_{n\to\infty} \sup_{x\in X} \frac{1}{n} S_n\varphi(x) \le \max\{P(\varphi) - L,\ P(\varphi) - L'\} < P(\varphi),$$

and it is easy to verify (III) from this together with $h(\mathcal{C}^p \cup \mathcal{C}^s) = 0$.



5.2. β-shifts. Every β-shift can be presented by a countable state
directed labelled graph with vertices $v_1, v_2, \dots$. For every $i \ge 1$, we draw
an edge from $v_i$ to $v_{i+1}$, and label it with the value $\omega^\beta_i$. Next, whenever
$\omega^\beta_i > 0$, for each integer from 0 to $\omega^\beta_i - 1$, we draw an edge from $v_i$ to
$v_1$ labelled by that value.

The β-shift can be characterised as the set of sequences given by
the labels of infinite paths through the directed graph which start at
$v_1$. For our set $\mathcal{G}$, we take the collection of words labelling a path
that begins and ends at the vertex $v_1$. Thus, $\mathcal{G}$ automatically satisfies
(O)-specification, and in particular, [A.1] holds.

Let $\varphi\colon \Sigma_\beta \to \mathbb{R}$ be a continuous function satisfying the Bowen property.
It is shown in [8, §3.1] that conditions (I)-(III) in Theorem 3.1
hold, so it only remains to check condition [A.2].

We now show that $\mathcal{L}$ is edit approachable (in fact, Hamming approachable)
by $\mathcal{G}$ with mistake function $g \equiv 1$. Let $z \in \mathcal{L}_n$. We set
$j := \max\{1 \le i \le n : z_i \ne 0\}$ and define a new word $w \in \mathcal{L}_n$ by

$$w_i = \begin{cases} z_i & (1 \le i \le n,\ i \ne j), \\ z_j - 1 & (i = j). \end{cases}$$

It is easy to see that $d(z, w) = 1 = g(n)$ and $w \in \mathcal{G}$, which implies
[A.2]. It follows that $(\Sigma_\beta, \sigma)$ satisfies the level-2 large deviations
principle with reference measure $m_\varphi$, and rate function $q^\varphi$ given by (1.1).
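The graph presentation and the single-symbol edit can be traced by hand on short words. The Python sketch below is only an illustration under stated assumptions: `omega` is a finite prefix of the expansion $\omega^\beta$ supplied by the caller (here the first digits for β = 1.5), vertices are numbered from 1, and the function names are hypothetical.

```python
from typing import List, Optional, Sequence


def final_vertex(word: Sequence[int], omega: Sequence[int]) -> Optional[int]:
    """Follow `word` through the graph starting at v_1.

    Returns the index of the vertex reached, or None if some symbol has
    no outgoing edge (the word is not in the language).
    """
    if len(omega) < len(word):
        raise ValueError("need at least len(word) digits of omega")
    vertex = 1
    for symbol in word:
        top = omega[vertex - 1]
        if symbol == top:
            vertex += 1   # edge v_i -> v_{i+1} labelled omega_i
        elif symbol < top:
            vertex = 1    # edge v_i -> v_1 labelled by a smaller value
        else:
            return None
    return vertex


def lower_last_nonzero(word: Sequence[int]) -> List[int]:
    """Decrease the last non-zero symbol by one (a single substitution).

    An all-zero word is returned unchanged; it already ends at v_1.
    """
    edited = list(word)
    for j in range(len(edited) - 1, -1, -1):
        if edited[j] != 0:
            edited[j] -= 1
            break
    return edited


omega = [1, 0, 1, 0, 0, 0, 0, 0, 1]  # first digits for beta = 1.5
z = [1, 0, 1]
w = lower_last_nonzero(z)
print(final_vertex(z, omega), w, final_vertex(w, omega))
```

In this example `z` follows $\omega^\beta$ and ends away from $v_1$, while the edited word returns to $v_1$, so it labels a path that begins and ends at $v_1$.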



6. Proofs of other technical results 

Proof of Lemma 2.6. We obtain an upper bound on the number of
words that can be obtained by making at most $m$ edits to $w$ as follows.
We introduce an additional symbol $e$ (for 'edit'), and construct a new
word $w'$ of length $n + m$ which contains exactly $m$ of the symbols $e$, and
so that $w_1 = w'_1$. Note that $\binom{n+m}{m}$ is an upper bound on the number of
such words $w'$. Now obtain a new word $v \in \mathcal{L}$ from $w'$ by performing
exactly one of the following actions at each symbol $e$, and then deleting
the $e$.

(1) Change the symbol immediately before $e$ to a different symbol.

(2) Insert a symbol immediately before $e$.

(3) Delete the symbol immediately before $e$.

(4) Leave the symbol immediately before $e$ unchanged.

Note that every word $v$ which satisfies $d(v, w) \le m$ can be produced by
this procedure. At each symbol $e$, there are a total of $2\#A + 2$ possible
actions, so we see that

$$\#\{v \mid d(v, w) \le m\} \le (2\#A + 2)^m \binom{n+m}{n}.$$

From Stirling's formula there is a constant $C$ such that
$$|\log n! - (n\log n - n)| \le C\log n$$
for every $n \in \mathbb{N}$, and so when $m \le \delta n$ we have

$$\begin{aligned}
\log\binom{n+m}{n} &= \log(n+m)! - \log n! - \log m! \\
&\le \big((n+m)\log(n+m) - n\log n - m\log m\big) + C\big(\log(m+n) + \log m + \log n\big) \\
&\le n\log\frac{n+m}{n} + m\log\frac{n+m}{m} + 3C\log(m+n) \\
&\le n\big(\log(1+\delta) + \delta\log(1+\delta^{-1})\big) + 3C\log((1+\delta)n) \\
&= n\big((1+\delta)\log(1+\delta) - \delta\log\delta\big) + 3C\log((1+\delta)n).
\end{aligned}$$

Using the inequalities $1 + \delta \le 2$ and $\log(1+\delta) \le \delta$, we see that the
left-hand side of (2.2) admits the bound

#{t; G C I d{v, w) < 5n} 

which completes the proof. D 
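The counting argument above treats a word $v$ as close to $w$ when it arises from at most $m$ substitutions, insertions or deletions. Assuming the metric $d$ from §2 is the standard edit (Levenshtein) distance built from exactly these three operations, the following Python sketch computes it by dynamic programming; the function name is hypothetical.

```python
from typing import Sequence


def edit_distance(v: Sequence, w: Sequence) -> int:
    """Minimum number of substitutions, insertions and deletions that
    transform v into w (standard Levenshtein distance)."""
    previous = list(range(len(w) + 1))  # distance from the empty prefix of v
    for i, a in enumerate(v, start=1):
        current = [i]
        for j, b in enumerate(w, start=1):
            current.append(
                min(
                    previous[j] + 1,             # delete a
                    current[j - 1] + 1,          # insert b
                    previous[j - 1] + (a != b),  # substitute, or keep if equal
                )
            )
        previous = current
    return previous[-1]


print(edit_distance("0010", "0110"), edit_distance("001", "01"))
```

The same function accepts lists of integers, so it can be applied to words over any finite alphabet.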

Proof of Lemma 2.8. Let $\tilde g(n) = g(n) + 1$, so that $\tilde g$ is also a mistake
function. Take $x, y$ and $m, n$ as in the hypothesis of the lemma, and
let $k = d(x_1 \cdots x_n, y_1 \cdots y_m) \le g(n)$.

Following the set-up of the proof of the previous lemma, we obtain
a new word $w'$ by inserting the symbol $e$ into $k$ positions of $x_1 \cdots x_n$
to mark where an insertion, deletion or substitution will take place to
obtain $y_1 \cdots y_m$. We write $w' = w^1 w^2 \cdots w^{k+1}$ so that the last symbol
of each $w^i$ with $1 \le i \le k$ is $e$ (note that $w^{k+1}$ may be the empty word).
Let $w^i_r$ be the word obtained by omitting the last two symbols from $w^i$,
and form the word $w_r = w^1_r w^2_r \cdots w^{k+1}_r$ (where $r$ stands for 'reduced',
and if $|w^i| \le 2$, then $w^i_r$ is the empty word). For $n \ge 0$, let

$$V(n) = \sup\{|S_m\varphi(x) - S_{m'}\varphi(y)| \;:\; x_1 \cdots x_n = y_1 \cdots y_n \text{ and } m, m' \in \{n, n+1, n+2\}\}.$$

Note that continuity of $\varphi$ implies that $\frac{1}{n}V(n) \to 0$. In particular, for
$z \ge 1$, we may write $\varepsilon(z) = \sup_{m \ge z} \frac{1}{m}V(m)$ and obtain $\varepsilon(z) \to 0$. We
will use this fact for "long" words, while for "short" words we will use
the bound $V(n) \le 2(n+2)\|\varphi\| \le 4(n+1)\|\varphi\|$.

Both $x_1 \cdots x_n$ and $y_1 \cdots y_m$ can be obtained from $w_r$ by inserting at
most two symbols at the end of each subword $w^i_r$, and so

(6.1) $$|S_n\varphi(x) - S_m\varphi(y)| \le \sum_{j=1}^{k+1} V(n_j),$$



where n, = \wi\. To bound this sum, we let C„ = . hn^ and break the 
sum into two parts, corresponding to rij < Cn and rij > Cn- We have 

(6.2) < J2 4K + 1)11^11+ Y^ ^ACn) 

nj<Cn nj>Cn 

< ACnW^Wgin) +ne{Cn), 

where the last inequality uses the fact that there are k + 1 < g{n) 
values of j in total, and that ^ rij < n. 

Now we can estimate the difference in Lemma 12.81 as 



1^ , . 1 



n m 



1 



Sn^{x) S^ifiy) < - \Sn(p{x) - Sm^{y)\ 



n 



1 1 
n m 



\Sm'^ 



^ All 11^ ^(^) /^ \ \m — n\ 1 .^ , ,. 
n n m 

\ n n 

Because $\tilde g$ is a mistake function, the first and third terms go to 0 as
$n \to \infty$, while $C_n \to \infty$ and so the second term goes to 0 as well. This
completes the proof of Lemma 2.8. □

Proof of Proposition 2.9. Clearly $P(\mathcal{G}, \varphi) \le P(\varphi)$, so it suffices to prove
the other inequality. We compare $\Lambda_n(\mathcal{L}, \varphi)$ and $\Lambda_n(\mathcal{G}, \varphi)$ using
Lemmas 2.6 and 2.8. By edit approachability, for each $w \in \mathcal{L}_n$ there exists
$v = v(w) \in \mathcal{G}$ such that $d(v, w) \le g(|w|)$. Lemma 2.6 tells us that
given $v \in \mathcal{G}$, the number of words $w \in \mathcal{L}_n$ for which $v = v(w)$ is at
most

where $\delta = g(n)/n$. In particular, for all sufficiently large $n$ this expression
is bounded above by $e^{n\delta'_n}$, where $\delta'_n \to 0$.

It follows from Lemma 2.8 that there is $\delta_n \to 0$ such that for every
$v, w$ as above and any $x \in [v]$, $y \in [w]$, we have

$$|S_{|v|}\varphi(x) - S_n\varphi(y)| \le n\delta_n.$$

Together the above estimates imply that

n+g{n) 
m=n—g{n) w£Qm 

and so in particular there is m G [n — g{n),n + g{n)] such that 

2g{n) 
Since $g(n)$ is sublinear and $\delta_n, \delta'_n \to 0$, this implies the result. □

Proof of Lemma 2.13. Items 1) and 3) can be obtained by making minor
modifications to the proof of Proposition 2.2 in [7, §6.2], so we omit
these arguments and prove only item 2).

Let Q C >C(S) and Q C C{X) be as in Lemma 12.121 and assume 



that Q satisfies [A.2] , Let (?: N — ?■ N be a mistake function as in 
[A. 2 for Q. Then we define a mistake function g: N — )■ N by g{n) = 
(4r + 3)g{n + 2r) + 4r. Take a 2 G C{X)n. Since "^ is surjective, there 

we can 



exists z G C{Ti)n+2r so that "^{z) = z. Since Q satisfies [A.2 

find w G ^ so that d{z, w) < g{n + 2r) holds, where we recall that r is 

the length of the block code. We set w = "^{w). 

Because d{z, w) < g{n + 2r), there exist an integer K > n — ((2r + 
l)g{n + 2r) + 2r) and two increasing sequences mi < ■ ■ ■ < mx, ui < 
■ ■ ■ < riK so that 



\[
\tilde{z}_{m_i - r} \cdots \tilde{z}_{m_i + r} = \tilde{w}_{n_i - r} \cdots \tilde{w}_{n_i + r}
\]



for each $1 \le i \le K$. Because $\Psi$ is a block code with length $r$, we have
$z_{m_i} = w_{n_i}$ for $1 \le i \le K$. This implies that

\[
d(z, w) \le (n - K) + (|w| - K) \le 2(n - K) + \bigl| |w| - n \bigr| \le g(n).
\]

Thus $g$ satisfies [A.2]. □
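The counting step above, where $K$ matched positions give $d(z,w) \le (n-K) + (|w|-K)$, can be checked numerically. The following is a minimal sketch, assuming that the edit distance $d$ is the usual Levenshtein distance (substitutions, insertions, deletions); the function names are hypothetical and chosen only for this illustration. A set of $K$ matched positions in increasing order is a common subsequence of length $K$, so the bound should hold with $K$ equal to the length of a longest common subsequence.

```python
import random


def edit_distance(z, w):
    """Levenshtein distance between words z and w (assumed model of d)."""
    prev = list(range(len(w) + 1))
    for i, a in enumerate(z, 1):
        cur = [i]
        for j, b in enumerate(w, 1):
            cur.append(min(prev[j] + 1,              # delete a
                           cur[j - 1] + 1,           # insert b
                           prev[j - 1] + (a != b)))  # substitute or match
        prev = cur
    return prev[-1]


def lcs_length(z, w):
    """Length K of a longest common subsequence of z and w."""
    prev = [0] * (len(w) + 1)
    for a in z:
        cur = [0]
        for j, b in enumerate(w, 1):
            cur.append(prev[j - 1] + 1 if a == b else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def check_bound(trials=200, max_len=12, seed=0):
    """Check d(z, w) <= (|z| - K) + (|w| - K) on random binary words."""
    rng = random.Random(seed)
    for _ in range(trials):
        z = "".join(rng.choice("01") for _ in range(rng.randint(0, max_len)))
        w = "".join(rng.choice("01") for _ in range(rng.randint(0, max_len)))
        K = lcs_length(z, w)
        if edit_distance(z, w) > (len(z) - K) + (len(w) - K):
            return False
    return True


print(check_bound())
```

The bound holds because deleting the unmatched symbols of $z$ and inserting the unmatched symbols of $w$ is one admissible sequence of edits.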

Proof of Lemma 3.5. We have

and so $P(\mathcal{C}, \varphi) \le h(\mathcal{C}) + \sup \varphi$. By the variational principle and the
assumption (IBRD , we have 

\[
P(X, \varphi) \ge h(X) + \inf \varphi > h(\mathcal{C}) + \sup \varphi \ge P(\mathcal{C}, \varphi),
\]

which proves the lemma. □
Proof of Theorem C. We assume the hypotheses of Theorem C. By
Lemmas 2.12 and 2.13, $\mathcal{L}(X)$ has a decomposition $\mathcal{C}^p \mathcal{G} \mathcal{C}^s$ satisfying
[A.1], [A.2], (I) and $h(\mathcal{C}^p \cup \mathcal{C}^s) = 0$. If $\varphi\colon X \to \mathbb{R}$ is a
continuous function which satisfies the Bowen property (II), and the
condition $\sup \varphi - \inf \varphi < h(X)$, then by Lemma 3.5, (III) holds. Thus,
the hypotheses of Theorem B are satisfied for the equilibrium measure
$\mu_\varphi$ on $X$, which immediately proves Theorem C. □

Proof of Lemma 5.1. Let $V$ be such that $|S_n\varphi(x) - S_n\varphi(y)| \le V$ whenever
$x_1 \cdots x_n = y_1 \cdots y_n$, and in particular $S_n\varphi(x) \ge n\varphi(0) - V'$ for
every $x \in [0^{n-1}1]$, where $V' = V + \varphi(0) - (\inf \varphi)$.

Choose $k$ large (just how large will be determined later) and let
$n_1, n_2, \dots, n_k \in S$ be distinct. Let $\pi$ be any permutation of the integers
$\{1, \dots, k\}$, and let $w^\pi$ be the word $0^{n_{\pi(1)}}1\,0^{n_{\pi(2)}}1 \cdots 0^{n_{\pi(k)}}1$ of length
$N = \sum_{j=1}^{k} (n_j + 1)$. The estimates in the previous paragraph give

\[
S_N\varphi(y) \ge N\varphi(0) - kV'
\]

for every $y \in [w^\pi]$. Now let $\vec{\pi} = (\pi_1, \dots, \pi_m)$ be any sequence of $m$
such permutations, and let $v^{\vec{\pi}} = w^{\pi_1} \cdots w^{\pi_m}$. Choosing any $y^{\vec{\pi}} \in [v^{\vec{\pi}}]$,
we obtain the estimate

\[
\Lambda_{mN}(\mathcal{L}, \varphi) \ge \sum_{\vec{\pi}} e^{S_{mN}\varphi(y^{\vec{\pi}})} \ge (k!)^m e^{mN\varphi(0) - mkV'}. \tag{6.3}
\]

We have the general bound

\[
\log(k!) = \sum_{j=1}^{k} \log j \ge \int_1^k \log t \, dt = k \log k - k + 1 \ge k \log k - k - 1, \tag{6.4}
\]

which yields

\[
\log \Lambda_{mN}(\mathcal{L}, \varphi) \ge m(k \log k - k - 1) + mN\varphi(0) - mkV',
\]

so that dividing by $mN$ and sending $m \to \infty$ we have

\[
P(\varphi) \ge \varphi(0) + \frac{k}{N} \left( \log k - 1 - \frac{1}{k} - V' \right).
\]

Taking $k$ large gives the result. □
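The factor $(k!)^m$ in (6.3) counts the concatenations $w^{\pi_1} \cdots w^{\pi_m}$, which are pairwise distinct because the gap lengths $n_1, \dots, n_k$ are distinct. The following is a small sketch of that count; the gap lengths `(1, 2, 4)` are a hypothetical choice of elements of $S$, and the function names are illustrative only.

```python
from itertools import permutations, product
from math import factorial


def block_word(gaps):
    """Return the word 0^{n_1} 1 0^{n_2} 1 ... for the given gap lengths."""
    return "".join("0" * n + "1" for n in gaps)


def permutation_words(gaps):
    """All words w^pi obtained by permuting a tuple of distinct gap lengths."""
    return {block_word(p) for p in permutations(gaps)}


def concatenated_words(gaps, m):
    """All m-fold concatenations w^{pi_1} ... w^{pi_m}."""
    blocks = sorted(permutation_words(gaps))
    return {"".join(choice) for choice in product(blocks, repeat=m)}


gaps = (1, 2, 4)                  # hypothetical distinct elements of S
k = len(gaps)
N = sum(n + 1 for n in gaps)      # common length of every w^pi
words = concatenated_words(gaps, m=2)
print(len(words), factorial(k) ** 2, N)
```

Each concatenated word determines its sequence of gap lengths, so distinct sequences of permutations give distinct words of the same length $mN$.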



Proof of Lemma 5.2. By Lemma 5.1 there exists $\varepsilon > 0$ such that
$\varphi(0) + 2\varepsilon V' < P(\varphi)$, where $V'$ is the constant from the proof of the
previous lemma. Note that if $x_1 \cdots x_n$ contains fewer than $\varepsilon n$ occurrences
of the symbol 1, then

\[
S_n\varphi(x) \le n\varphi(0) + \varepsilon n V',
\]

and in particular

\[
\frac{1}{n} S_n\varphi(x) \le \varphi(0) + \varepsilon V' < P(\varphi) - \varepsilon V'. \tag{6.5}
\]

Setting $L = \varepsilon V'$ gives the result. □

Proof of Lemma 5.3. We use the upper bound

\[
\begin{aligned}
\log(k!) = \sum_{j=1}^{k} \log j &\le \int_1^{k+1} \log t \, dt = (k + 1)\log(k + 1) - k \\
&= (k \log k - k) + k \log\left(1 + \frac{1}{k}\right) + \log(k + 1) \\
&\le (k \log k - k) + (1 + \log(k + 1)),
\end{aligned}
\]

which together with (6.4) gives, for all large $n$,

\[
\begin{aligned}
\log \binom{n}{k} &= \log(n!) - \log(k!) - \log((n - k)!) \\
&\ge (n \log n - n - 1) - (k \log k - k) - (1 + \log(k + 1)) \\
&\qquad - ((n - k)\log(n - k) - (n - k)) - (1 + \log(n - k + 1)) \\
&\ge n\, h\!\left(\frac{k}{n}\right) - 2 \log n,
\end{aligned}
\]

where $h(\delta) = -\delta \log \delta - (1 - \delta)\log(1 - \delta)$. □
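As a numerical sanity check of the factorial bounds and of the final inequality $\log\binom{n}{k} \ge n\,h(k/n) - 2\log n$, one can run the following sketch. It is not part of the proof; the helper names and the tolerance `1e-9` are assumptions made for this illustration, and `math.lgamma(k + 1)` is used as $\log(k!)$.

```python
import math


def log_factorial(k):
    """log(k!) computed via the log-gamma function."""
    return math.lgamma(k + 1)


def entropy(d):
    """h(d) = -d log d - (1 - d) log(1 - d), with h(0) = h(1) = 0."""
    if d <= 0.0 or d >= 1.0:
        return 0.0
    return -d * math.log(d) - (1 - d) * math.log(1 - d)


def log_binomial(n, k):
    """log of the binomial coefficient C(n, k)."""
    return log_factorial(n) - log_factorial(k) - log_factorial(n - k)


def factorial_bounds_hold(k_max=500, tol=1e-9):
    """Check k log k - k - 1 <= log k! <= (k log k - k) + 1 + log(k + 1)."""
    for k in range(1, k_max + 1):
        base = k * math.log(k) - k
        if not (base - 1 - tol <= log_factorial(k) <= base + 1 + math.log(k + 1) + tol):
            return False
    return True


def binomial_bound_holds(n_max=150, tol=1e-9):
    """Check log C(n, k) >= n h(k/n) - 2 log n for 2 <= n <= n_max, 0 <= k <= n."""
    for n in range(2, n_max + 1):
        for k in range(n + 1):
            if log_binomial(n, k) < n * entropy(k / n) - 2 * math.log(n) - tol:
                return False
    return True


print(factorial_bounds_hold(), binomial_bound_holds())
```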

Proof of Lemma 5.4. Now assume that $x_1 \cdots x_n$ contains $m \ge \varepsilon n$
occurrences of the symbol 1. By considering a smaller collection of indices
where the entry is 1 if necessary, we may assume that $m \le 2\varepsilon n$.

Given $\delta > 0$ small (just how small will be determined later), let
$\delta m \le k \le 2\delta m$. Let $R$ be the set of indices in which $x_1 \cdots x_n$ has a
nonzero symbol, and let $\mathcal{Z}$ be the collection of subsets of $R$ with exactly
$k$ elements.

We define a map $\phi\colon \mathcal{Z} \to X$ as follows. Fix $n_1 \ne n_2 \in S$. Given
$Z \in \mathcal{Z}$, at each index $k \in Z$ insert the word $0^{n_1}1$ into $x$, unless
$x_{k+1} \cdots x_{k+n_1+1} = 0^{n_1}1$, in which case insert the word $0^{n_2}1$. This is
allowed by the definition of the $S$-gap shift, and we note that the map
is injective.

Let $\ell = \max\{n_1, n_2\} + 1$, and observe that $\phi(Z)$ is obtained from $x$
by inserting at most $k\ell$ symbols, so that if $p$ is the size of the alphabet,
then the map $\Phi\colon \mathcal{Z} \to \mathcal{L}_n$ obtained by truncating $\phi(Z)$ to the first $n$
symbols has the property that $\#\Phi^{-1}(w) \le p^{k\ell}$ for each $w \in \mathcal{L}_n$.
We conclude that the map $\Phi$ yields at least $\binom{m}{k} p^{-k\ell}$ words $w$ in $\mathcal{L}_n$
with the property that

SnViy) > SnVix) - kiV > SnVix) - 4e6nV' 

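for every $y \in [w]$. In particular, together with Lemma 5.3 and the
conditions on $m$ and $k$, this gives the estimate

\[
\begin{aligned}
\log \Lambda_n(\mathcal{L}, \varphi) &\ge -m\delta \log \delta - 2\log m - k\ell \log p + S_n\varphi(x) - 4\varepsilon\delta n V' \\
&\ge (\varepsilon n)(-\delta \log \delta) - 4\log(\varepsilon n) - 4\varepsilon\delta n \ell \log p + S_n\varphi(x) - 4\varepsilon\delta n V'.
\end{aligned}
\]

Dividing by $n$ gives

\[
\frac{1}{n} \log \Lambda_n(\mathcal{L}, \varphi) \ge \frac{1}{n} S_n\varphi(x) + \varepsilon\delta(-\log \delta - 4\ell \log p - 4V') - 4\,\frac{\log(\varepsilon n)}{n},
\]

which yields the desired result when $\delta$ is chosen sufficiently small and
$n$ is chosen sufficiently large. □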

Appendix A. From (S)-specification to (W)-specification 

We fill in the details of the proof that Theorem 3.1 holds as stated,
with the assumption that each $\mathcal{G}^M$ has (W)-specification. This theorem
appears as Theorem C of [7] with the stronger assumption that each
$\mathcal{G}^M$ has (S)-specification.

In [7], the hypothesis of (S)-specification for $\mathcal{G}^M$ is used in exactly
three places: in the proofs of Lemma 5.1, Proposition 5.5, and Lemma
5.9. We describe how Lemma 4.3 lets us prove these intermediate
results using (W)-specification for $\mathcal{G}^M$.

Given $M > 0$, let $(\mathcal{G}^M)^*$ denote the set of finite sequences $(w^1, \dots, w^k)$
where each $w^i \in \mathcal{G}^M$. Define a map $\Phi\colon (\mathcal{G}^M)^* \to \mathcal{L}$ as in (4.3), and
let $\Phi_0(\vec{w})$ be the truncation of $\Phi(\vec{w})$ to its first $N$ symbols, where
$N = \sum_{i=1}^{k} |w^i|$. By Lemma 4.3, for every $M > 0$ there exists a
constant $C_M > 0$ such that for each $n_1, \dots, n_k \in \mathbb{N}$, the map

\[
\Phi_0 \colon \prod_{i=1}^{k} (\mathcal{G}^M)_{n_i} \to \mathcal{L}_{\sum_{i=1}^{k} n_i}
\]

has the property that

\[
\#\Phi_0^{-1}(v) \le C_M^k \quad \text{for every } v \in \mathcal{L}_{\sum_{i=1}^{k} n_i}. \tag{A.1}
\]

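For [8, Lemma 5.1], we can repeat the argument from that paper and
apply (A.1), where $\Phi_0$ has the domain $\prod_{i=1}^{k} (\mathcal{G}^M)_n$, to get the bound

\[
\Lambda_{kn}(\mathcal{L}, \varphi) \ge \Lambda_n(\mathcal{G}^M, \varphi)^k \left( e^{-(V_M + t_M\|\varphi\|)} C_M^{-1} \right)^k,
\]

which yields

\[
\frac{1}{n} \log \Lambda_n(\mathcal{G}^M, \varphi) \le P(\varphi) + \frac{1}{n}\left( V_M + t_M\|\varphi\| + \log C_M \right)
\]

and shows that there exists $D_M > 0$ with $\Lambda_n(\mathcal{G}^M, \varphi) \le D_M e^{nP(\varphi)}$ for
all $n$.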

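For [8, Proposition 5.5], we follow the proof in that paper and take
$M$ such that $\Lambda_n(\mathcal{G}^M, \varphi) \ge C e^{nP(\varphi)}$ for all $n$. Fixing $w \in \mathcal{G}^M$, we let
$n = |w|$ and estimate $\nu_m(\sigma^{-k}[w])$ for $0 \le k \le m$. Let $\ell = m - k - n$
and define maps

\[
\Phi \colon (\mathcal{G}^M)_k \times (\mathcal{G}^M)_\ell \to \mathcal{L}_m, \qquad \tau \colon (\mathcal{G}^M)_k \times (\mathcal{G}^M)_\ell \to \{0, 1, \dots, t_M\}^2
\]

by $\Phi(w^1, w^2) = v_1 \cdots v_m$, where $v = w^1 u^1 w u^2 w^2$ and $\tau(w^1, w^2) =
(|u^1|, |u^2|)$. Now there exist $\tau_1, \tau_2 \in \{0, 1, \dots, t_M\}$ such that

\[
\sum_{\tau(w^1, w^2) = (\tau_1, \tau_2)} e^{\varphi_m(\Phi(w^1, w^2))} \ge \frac{1}{(t_M + 1)^2} \sum_{w^1, w^2} e^{\varphi_m(\Phi(w^1, w^2))},
\]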

where $\varphi_m(u) = \sup_{x \in [u]} S_m\varphi(x)$ for $u \in \mathcal{L}_m$. In particular, the same
computation as in [8] gives

Summing over $k$ gives the desired lower bound for $\nu_m$ (at the cost of a
further factor of $(t_M + 1)^{-1}$), and passing to $m \to \infty$ gives the Gibbs
bound for $\mu$.

It remains only to prove [8, Lemma 5.9]; the proof is similar to [8,
Proposition 5.5]. We show that for all sufficiently large $M$, there exists
$E_M > 0$ such that if $u, v \in \mathcal{G}^M$, then

\[
\varlimsup_{n \to \infty} \mu([u] \cap \sigma^{-n}[v]) \ge E_M \mu(u)\mu(v). \tag{A.2}
\]

In [8], Lemma 5.9 provides the inequality (A.2) with a liminf instead
of a limsup, which is a slightly stronger result. This lemma is only
used in the proof of [8, Proposition 5.8], which establishes ergodicity of
the equilibrium state. One can easily verify that this ergodicity proof
only requires the limsup in the inequality (A.2).

To this end, we take $M$ large enough that $\Lambda_n(\mathcal{G}^M, \varphi) \ge C e^{nP(\varphi)}$, fix
$u, v \in \mathcal{G}^M$, take $m \in \mathbb{N}$ large, and fix $k \le m$. We estimate

\[
(\nu_m \circ \sigma^{-\bar{k}})([u] \cap \sigma^{-\bar{n}}[v]) = \nu_m(\sigma^{-\bar{k}}[u] \cap \sigma^{-(\bar{k} + \bar{n})}[v])
\]

for some $\bar{k} \in [k, k + t_M]$ and some $\bar{n} \in [n, n + 2t_M]$. We let $d =
m - n - k - |u| - |v|$ and observe that (W)-specification allows us
to associate to every $(w^1, w^2, w^3) \in (\mathcal{G}^M)_k \times (\mathcal{G}^M)_n \times (\mathcal{G}^M)_d$ a word
in $\mathcal{L}_m$ given by the first $m$ symbols of $w^1 x^1 u x^2 w^2 x^3 v x^4 w^3$, where
$|x^i| = \tau_i \in \{0, 1, \dots, t_M\}$. As in the modified proof of [8, Proposition
5.5], we can choose $(\tau_1, \tau_2, \tau_3, \tau_4)$ such that the set of triples $(w^1, w^2, w^3)$
for which $|x^i| = \tau_i$ has weight at least $(t_M + 1)^{-4}$ of the total (over all
triples), and then putting $\bar{k} = k + \tau_1$ and $\bar{n} = \tau_2 + n + \tau_3$, we have

{um o a-^)(M n a""H) > ^lif2^e-^*«^(^)e-(3^+^*-ll'^ll)/i(w)/i(t;). 

Summing over $k$ gives us the result for $\mu_m$ at the cost of another factor
of $(t_M + 1)^{-1}(2t_M + 1)^{-1}$, and then sending $m \to \infty$ gives (A.2).